
I've looked at the links suggested by SO, but none seem relevant, so here goes.

(In the following, the actual URL and header data have been obfuscated.)

I would really appreciate some help automating the download of data from an HTTPS web page, using Delphi and (preferably) TWebBrowser, as I'm not having a lot of success with Indy.

In Firefox I navigate to the URL https://www.thedomain.com/folder1/Showdata.aspx?code=111, which shows some data. There I can click an image that downloads the data as CSV.

I am trying to automate that process using TWebBrowser in Delphi, either by navigating to the URL and then clicking that image, or by programmatically calling the same function that the image calls.

The web page renders the image using this anchor:

<a class="button dlCSV" href="javascript:void(0);" onclick="MyLib.DownloadCsv();return false;">Download</a>

It looks like clicking this causes a POST to

https://www.thedomain.com/folder1/AnotherFolder/Sendcsv.ashx

(with a load of headers) that actually does the download.

I first tried using TWebBrowser to navigate to

https://www.thedomain.com/folder1/Showdata.aspx?code=111

and then clicked the button manually in the TWebBrowser window. That downloaded an empty file called Sendcsv.ashx.

Then I tried programmatically clicking the image using code found here: http://www.experts-exchange.com/Programming/Languages/Pascal/Delphi/Q_27493399.html. That didn't seem to do anything. It found the image and called its click method, but nothing was downloaded and no error message was shown.

Can anyone help me with the code needed to download this data using Delphi?


Additional info that might be useful.

Using the network tab in the debugger in Firefox I managed to find out that clicking the image caused a POST to https://www.thedomain.com/folder1/AnotherFolder/Sendcsv.ashx

and there were loads of headers, shown below. However, I don't know how to write the code that will send these headers to Sendcsv.ashx (if that's what I need to do!).

Response headers

  • Cache-Control:"max-age=0"
  • Content-Disposition:"attachment; filename=thedata.csv"
  • Content-Encoding:"gzip"
  • Content-Length:"6050"
  • Content-Type:"text/html; charset=utf-8"
  • Date:"Tue, 03 Nov 2014 10:05:43 GMT"
  • Pragma:"public"
  • Server:"Microsoft-IIS/8.0"
  • X-AspNet-Version:"4.0.30319"
  • X-Powered-By:"ASP.NET"

Request headers

  • Host:"www.thedomain.com"
  • User-Agent:"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:33.0) Gecko/20100101 Firefox/33.0"
  • Accept:"text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
  • Accept-Language:"en-GB,en;q=0.5"
  • Accept-Encoding:"gzip, deflate"
  • Referer:"https://www.thedomain.com/folder1/Showdata.aspx?code=111"
  • Cookie:"rememberme=True"
  • Connection:"keep-alive"
  • Content-Type:"application/x-www-form-urlencoded"
  • Content-Length:"15972"

Request headers from upload stream

  • Content-Type:"application/x-www-form-urlencoded"
  • Content-Length:"15972"
user3209752
  • `I'm not having a lot of success with Indy` - Indy can do an HTTP POST, so it should work. It also supports cookies and TLS (HTTPS). So I would give Indy a second try. – mjn Nov 04 '14 at 12:13
  • I would, but I understand how to use Indy even less than TWebBrowser. I spent a fortnight failing to get it to download a PDF correctly before finding that the Indy version I have has a bug and the newest is incompatible with another component I have. After that I sort of gave up on Indy and now use Overbyte ICS for direct web/email stuff. If this file download can be done with that, it would be handy. – user3209752 Nov 04 '14 at 12:54
  • Have you looked into this SO post? [File Download by Calling .ashx page](http://stackoverflow.com/questions/12087040/file-download-by-calling-ashx-page). No javascript required. Try to keep things simple – Chris Nov 04 '14 at 16:49
  • @Chris Yes, I did read that one, and although it seemed to be addressing the issue I didn't really understand the answer, i.e. how to use Delphi and TWebBrowser to do what it was suggesting. I also read [link](http://stackoverflow.com/questions/10912164/what-is-the-best-way-to-download-file-from-server) but I had the same issue. I also don't really know what, if anything, the ashx is expecting by way of parameters/headers. Hence the post here. – user3209752 Nov 04 '14 at 19:22
  • So when push comes to shove, you want to call the JavaScript function MyLib.DownloadCsv() on the website from within Delphi? – Jens Borrisholt Nov 05 '14 at 07:01

1 Answer


Lately, for situations like this, I've been using the MSXML2_TLB.pas unit, obtained by importing the Microsoft XML type library.

It has a class XMLHTTP, which does exactly what an XMLHttpRequest does; you might recognize it from web development.

uses SysUtils, ActiveX, AxCtrls, MSXML2_TLB;

//be sure to call CoInitialize(nil); on application or thread startup

//procedure ...
var
  reqData:string;
  req:XMLHTTP;
  str:TOleStream;
  f:TFileStream;
begin
  //Did you catch the request data? load/set/build it here:
  //reqData:=

  req:=CoXMLHTTP.Create;
  req.open('POST','https://www.thedomain.com/folder1/AnotherFolder/Sendcsv.ashx',false,'','');
  req.setRequestHeader('Host','www.thedomain.com');
  // try first without this one: it may not be required:
  //req.setRequestHeader('Referer','https://www.thedomain.com/folder1/Showdata.aspx?code=111');
  // same here, may not affect response
  //req.setRequestHeader('Cookie','rememberme=True');
  req.setRequestHeader('Content-Type','application/x-www-form-urlencoded');
  req.setRequestHeader('Content-Length',IntToStr(Length(reqData)));
  req.send(reqData);

  if req.status=200 then
   begin
    str:=TOleStream.Create(IUnknown(req.responseStream) as IStream);
    try
      f:=TFileStream.Create('thedata.csv',fmCreate);//or TMemoryStream?
      try
        f.CopyFrom(str,str.Size);
      finally
        f.Free;
      end;
    finally
      str.Free;
    end;
   end
  else
    raise Exception.Create('CSV request failed: '+req.statusText);//parse req.responseText?
end;
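
If the request body has to be assembled by hand, a minimal sketch of building `reqData` as `application/x-www-form-urlencoded` text might look like the following. The helper names `FormEncode` and `BuildFormBody` and the field names `code` and `format` are hypothetical, chosen only for illustration; the real field names and values have to be copied from the POST captured in the browser's network tab.

```pascal
program FormBodySketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Percent-encode one name or value for a form-urlencoded body.
function FormEncode(const S: string): string;
var
  U: UTF8String;
  I: Integer;
  B: Byte;
begin
  Result := '';
  U := UTF8Encode(S); // work on the UTF-8 bytes of the text
  for I := 1 to Length(U) do
  begin
    B := Ord(U[I]);
    case B of
      Ord('A')..Ord('Z'), Ord('a')..Ord('z'), Ord('0')..Ord('9'),
      Ord('-'), Ord('_'), Ord('.'), Ord('*'):
        Result := Result + Char(B);            // safe characters pass through
      Ord(' '):
        Result := Result + '+';                // a space becomes '+'
    else
      Result := Result + '%' + IntToHex(B, 2); // everything else becomes %XX
    end;
  end;
end;

// Join name/value pairs as name1=value1&name2=value2...
function BuildFormBody(const Names, Values: array of string): string;
var
  I: Integer;
begin
  if Length(Names) <> Length(Values) then
    raise Exception.Create('Names and Values must have the same length');
  Result := '';
  for I := 0 to High(Names) do
  begin
    if I > 0 then
      Result := Result + '&';
    Result := Result + FormEncode(Names[I]) + '=' + FormEncode(Values[I]);
  end;
end;

begin
  // Hypothetical fields: replace with those seen in the captured request.
  WriteLn(BuildFormBody(['code', 'format'], ['111', 'csv file']));
end.
```

The resulting string could then be assigned to `reqData` before calling `req.send`. A request body of about 16 KB suggests the page posts a lot of hidden form state, so it may be simpler to capture the body once and replay it than to rebuild every field.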
Stijn Sanders
  • Thank you. From that I can see the sort of operations I need to do. After a great deal of fiddling around with Delphi 2009, I think I managed to import the type library (it's not obvious how to do that, or why), and at least it no longer complains about `uses ActiveX, AxCtrls, MSXML2_TLB`. However, it still says XMLHTTP is an unknown identifier. Is it simpler just to add a new unit and copy/paste the code from https://code.google.com/p/omnixml/source/browse/trunk/MSXML2_TLB.pas into it? Would that have the same effect as importing the library, and maybe make XMLHTTP available? – user3209752 Nov 05 '14 at 09:15
  • One of the copies I use is here: https://github.com/stijnsanders/xxm/blob/master/Delphi/common/MSXML2_TLB.pas – Stijn Sanders Nov 05 '14 at 18:54
  • Thanks Stijn. That looks like the same code but actually an earlier version, so if that works for you I guess the one in my link above should work for me, once I get a handle on how to use it! I may appear dumb, but I'm not all that familiar with web site stuff; I tended to avoid it. My background research is in AI and neural networks, but my current specialism is in mathematical data analysis. Web sites, Java, HTML etc. are all a bit vague! – user3209752 Nov 05 '14 at 22:55