Author Topic: Get access to the HTML-code in a Google Chrome Window or Microsoft Edge window  (Read 6499 times)

MortenB

  • Jr. Member
  • **
  • Posts: 59
What I need to do can be easily described as:
1. Open a browser with the needed web page and URL parameters.
2. Get hold of the HTML, preferably into an AnsiString; saving the page to disk would also work, if I can choose the path and filename.
3. Close that particular browser window.

With Lazarus,
I can easily do 1.
I am lost on 2 and 3.

As some of the web pages I am trying to get the HTML from are frequently in use, the server may be slow, which may call for a method of checking whether the web page has finished loading or has stopped due to some error.
Is there a way to refresh the page if loading stops?

I have tried using various methods to download directly, but all fail, mainly because of the webpage being HTTPS not HTTP.
I can download some HTTP-pages, but even here not all.
Maybe if I could activate SSL in Synapse and download directly, that would be great, but I have problems getting that to work.
I don't even know if SSL is all that is needed. Maybe there is a way to check for this in the webpage-code?

As a possible workaround, I would like to interact with Chrome or MS Edge directly, but I am unsure of what I need to make this happen, if this is even possible.

I would greatly appreciate thoughts and ideas towards this.
Best regards and thank you in advance,
Morten

Soner

  • Sr. Member
  • ****
  • Posts: 305
I don't think it is possible to use Google Chrome or MS Edge with Lazarus, but you can use Internet Explorer:
http://wiki.freepascal.org/LazActiveX
(See the section: "Example: Internet Explorer in a form with event support.")
If you need help with this, search for "TWebbrowser delphi".

Have you tried with fphttpclient?
http://wiki.freepascal.org/fphttpclient
"... Since April 2014, the trunk/development fphttpclient supports SSL/TLS connections using the OpenSSL library .."


MortenB

Thank you very much for this first tip with LazActiveX!
I followed the examples, and for the first time, I can actually open the needed web-page using Lazarus.
I can also retrieve the needed HTML :)
There are just a few problems along this road.
When opening the main page for the needed web page, I receive script errors which I have to click away manually.
The good part is that I can still log on to the remote site.
And then the display of the webpage is not that great.
There are another 2 script errors which I have to click away manually,
and then I can get hold of the HTML...
...
I suppose.
I MUST log on; this cannot be avoided, so the first script errors must remain.
But from then on, I don't think I need the scripts running. I don't even need to display the page, only update the URL and retrieve the returned HTML.

I have more testing to do, but this is an update for now!
I will look more into TWebbrowser and Delphi too.
Thanks!

Soner

..
And then the display of the webpage is not that great
..
By default, your program runs the embedded browser in an old compatibility mode (Internet Explorer 7 rendering). You get a better display if you tell Windows that you want the latest Internet Explorer. Copy the lines below, save them as a *.reg file and open it in Explorer to enter them into the registry.
In those lines, replace 'YOUR_PROGRAM_NAME.exe' with the name of your program. Only the program file name, no path!
When you have Internet Explorer 11:

copy from next line ---->
Windows Registry Editor Version 5.00

[HKEY_CURRENT_USER\Software\Microsoft\Internet Explorer\Main\FeatureControl\FEATURE_BROWSER_EMULATION]
"YOUR_PROGRAM_NAME.exe"=dword:00002AF8
<---until this line
This was stupid, so I added the file as attachment. :)


Here are the possible hex values after dword:

Hex value    Internet Explorer version
00002AF8     IE11
00002710     IE10
00002328     IE9
00001F40     IE8
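
As an alternative to importing a *.reg file, the same value can probably be written from the program itself at startup, before the browser control is created. This is only a minimal sketch, assuming Windows and the TRegistry class from the Registry unit; the program name setbrowseremulation and the constant names are made up for illustration, and the value is registered under the name of the running executable.

```pascal
program setbrowseremulation;
{$ifdef fpc}{$mode delphi}{$H+}{$endif}
uses
  Windows, SysUtils, Registry; // Windows-only sketch
const
  // Same key as in the *.reg lines above, relative to HKEY_CURRENT_USER
  EmulationKey = 'Software\Microsoft\Internet Explorer\Main\FeatureControl\FEATURE_BROWSER_EMULATION';
  IE11Mode = $2AF8; // same value as dword:00002AF8 (11000 decimal)
var
  Reg: TRegistry;
begin
  Reg := TRegistry.Create;
  try
    Reg.RootKey := HKEY_CURRENT_USER;
    // True = create the key if it does not exist yet
    if Reg.OpenKey(EmulationKey, True) then
    begin
      // The value name is the bare executable name, no path
      Reg.WriteInteger(ExtractFileName(ParamStr(0)), IE11Mode);
      Reg.CloseKey;
    end;
  finally
    Reg.Free;
  end;
end.
```

In a real application these lines would go into the program that hosts the browser, since ParamStr(0) refers to the executable that is currently running.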



... 2 script errors..
You can tell TWebbrowser to be quiet; then you get no more script error messages.
Search the internet for "DLCTL_SILENT and invoke".

Maybe I can simplify my old Webbrowser control, convert it to Lazarus and post it here.
« Last Edit: August 04, 2018, 09:08:55 pm by Soner »

Thaddy

  • Hero Member
  • *****
  • Posts: 14201
  • Probably until I exterminate Putin.
Maybe if I could activate SSL in Synapse and download directly, that would be great, but I have problems getting that to work.
Here are four simple, cross-platform, working examples; all of them can also be made to cope with redirects (the dreaded 301).

Provided OpenSSL is installed correctly, this is all you need with Synapse:
Code: Pascal
program gethtmlpageA;
{$ifdef fpc}{$mode delphi}{$H+}{$endif}
uses
  classes, httpsend, ssl_openssl, ssl_openssl_lib; // the ssl_openssl units enable HTTPS
var
  http: THTTPSend;
  Resp: TStrings;
begin
  Resp := TStringList.Create;
  http := THTTPSend.Create;
  try
    // HTTPMethod returns True when the request itself went through
    if http.HTTPMethod('GET', 'https://freepascal.org/') then
    begin
      // Document is a memory stream holding the response body
      Resp.LoadFromStream(http.Document);
      writeln(Resp.Text); // or save it directly. See the next gethtmlpageB example
    end;
  finally
    http.Free;
    Resp.Free;
  end;
end.
Or:
Code: Pascal
program gethtmlpageB;
{$ifdef fpc}{$mode delphi}{$H+}{$endif}
uses
  classes, httpsend, ssl_openssl, ssl_openssl_lib;
var
  http: THTTPSend;
begin
  http := THTTPSend.Create;
  try
    if http.HTTPMethod('GET', 'https://freepascal.org/') then
      // Write the response body to a file; path and filename are up to you
      http.Document.SaveToFile('freepascal.org.html');
  finally
    http.Free;
  end;
end.
Or this simple version:
Code: Pascal
program gethtmlpagesimple;
{$ifdef fpc}{$mode delphi}{$H+}{$endif}
uses
  classes, httpsend, ssl_openssl, ssl_openssl_lib;
var
  list: TStrings;
begin
  list := TStringList.Create;
  try
    // HttpGetText fetches the page straight into the string list
    if HttpGetText('https://freepascal.org/', list) then
      writeln(list.Text);
  finally
    list.Free;
  end;
end.

You can also do it purely with the FPC-provided libraries from fcl-web and fcl-net, again provided OpenSSL is correctly installed:
Code: Pascal
program gethtmlpage2;
{$ifdef fpc}{$mode delphi}{$H+}{$endif}
// Note: FPC 3.2 and later also need opensslsockets in the uses clause for HTTPS
uses classes, fphttpclient, fpopenssl, openssl;
var
  Client: TFPHTTPClient;
  res: string;
begin
  InitSSLInterface; // load the OpenSSL library
  Client := TFPHTTPClient.Create(nil);
  try
    Client.AllowRedirect := true; // better set this for e.g. google etc.
    res := Client.Get('https://freepascal.org/');
    writeln(res); // or save it
  finally
    Client.Free;
  end;
end.
 

Examples 1A and B and 3 can later be expanded if you are also interested in, e.g., the response headers. Example 2 is the simplest.
fcl-web and fcl-net are best maintained and are core packages. Synapse code is also compatible with Delphi if you need that.
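
To illustrate the remark about response headers, here is a minimal sketch of how the Synapse examples might be expanded. It assumes the same setup as above (OpenSSL installed); the program name gethtmlheaders is just made up for this illustration.

```pascal
program gethtmlheaders;
{$ifdef fpc}{$mode delphi}{$H+}{$endif}
uses
  classes, httpsend, ssl_openssl, ssl_openssl_lib;
var
  http: THTTPSend;
begin
  http := THTTPSend.Create;
  try
    if http.HTTPMethod('GET', 'https://freepascal.org/') then
    begin
      // ResultCode is the HTTP status (200, 301, 404, ...)
      writeln('Status: ', http.ResultCode, ' ', http.ResultString);
      // Headers holds the response headers, one per line
      writeln(http.Headers.Text);
    end
    else
      // The request did not get through at all (e.g. no connection)
      writeln('Request failed, socket error ', http.Sock.LastError);
  finally
    http.Free;
  end;
end.
```

Checking ResultCode is also a simple way to notice a redirect or an error page before trusting the downloaded HTML.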
« Last Edit: August 05, 2018, 11:58:38 am by Thaddy »
Specialize a type, not a var.

MortenB

Thank you ALL very much for your time and effort giving me super good ideas here!
I do have a working version with Internet Explorer inside a window now.
Silencing the script errors was a bit difficult, even when looking at examples of how to do this in the Delphi community.

The command:

Browser.ComServer.Silent:=TRUE;

did the trick.

I will check out how to enable IE 11 with the registry "hack".

Again, thank you all!
Since I have a functional piece of software at the moment, I want to finish this, and THEN I would like to start using Synapse, or instead the FPC-provided libraries from fcl-web and fcl-net. I just have to figure out how to provide username and password on the first page I download... Well. That is for next month or so :)

Thaddy

Yes, setting Silent does the trick, but it draws in *a lot of* code.
When you examine my first and third examples, you will notice that those already give you the option of setting username and password. (Look at the source code.)
But if it works, it's ok for now.
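
For later reference, a minimal sketch of what setting username and password could look like with Synapse. It assumes the site uses HTTP Basic authentication, which is what the UserName and Password properties of THTTPSend are for; a site with a login form would need a POST request and cookie handling instead. The URL, the credentials and the output filename are placeholders.

```pascal
program gethtmlwithlogin;
{$ifdef fpc}{$mode delphi}{$H+}{$endif}
uses
  classes, httpsend, ssl_openssl, ssl_openssl_lib;
var
  http: THTTPSend;
begin
  http := THTTPSend.Create;
  try
    // Placeholder credentials, sent as HTTP Basic authentication
    http.UserName := 'myuser';
    http.Password := 'mypassword';
    // Placeholder URL; replace with the protected page
    if http.HTTPMethod('GET', 'https://example.com/protected/') and
       (http.ResultCode = 200) then
      http.Document.SaveToFile('protected.html')
    else
      writeln('Download failed, HTTP status ', http.ResultCode);
  finally
    http.Free;
  end;
end.
```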

MortenB

..
And then the display of the webpage is not that great
..
By default, your program runs the embedded browser in an old compatibility mode. You get a better display if you tell Windows that you want the latest Internet Explorer. Copy the lines, save them as a *.reg file and open it in Explorer to enter them into the registry.
Replace 'YOUR_PROGRAM_NAME.exe' with the name of your program. Only the program file name, no path!
When you have Internet Explorer 11:

... 2 script errors..

This works wonders. Now tested!
The display of the needed website is now just perfect. Not that it is that important for this project, but it does provide me with tons of new ideas!
I added system timers in order to wait for the web pages to load, then when they are loaded, I can extract the HTML and read out the information I need.
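
For anyone trying the same thing, the timer check can be sketched roughly like this. It is only an illustration, assuming Windows and that the browser object (for example Browser.ComServer from the LazActiveX example) is passed in as an OleVariant; the unit name browserwait and the function TryGetLoadedHtml are made up for this sketch. Busy, ReadyState and Document are members of the Internet Explorer WebBrowser object.

```pascal
unit browserwait;
{$ifdef fpc}{$mode delphi}{$H+}{$endif}

interface

// Returns True and fills Html once the page has finished loading.
// Returns False while the browser is still busy.
function TryGetLoadedHtml(Browser: OleVariant; out Html: string): Boolean;

implementation

uses
  ComObj, Variants; // ComObj enables late-bound calls on an OleVariant (Windows only)

const
  READYSTATE_COMPLETE = 4; // "fully loaded" state of the IE WebBrowser control

function TryGetLoadedHtml(Browser: OleVariant; out Html: string): Boolean;
var
  IsBusy: Boolean;
  State: Integer;
begin
  Html := '';
  IsBusy := Browser.Busy;
  State := Browser.ReadyState;
  Result := (not IsBusy) and (State = READYSTATE_COMPLETE);
  if Result then
    // outerHTML of the root element is the HTML of the whole page
    Html := Browser.Document.documentElement.outerHTML;
end;

end.
```

In a timer's OnTimer handler you would call TryGetLoadedHtml(Browser.ComServer, Html) and stop the timer as soon as it returns True. Counting the timer ticks gives a simple timeout for pages that never finish loading.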

Thank you so much!

 
