Good day. I'm using TIdHTTP in my Delphi application.

I would like to know whether it is possible to get the MD5 hash of a file hosted online, for example:

idhttp.Get('http.onedrive.com/arquive.rar');

Is it possible to get the MD5 of a file before downloading it, or do I have to download it first and then check the MD5?

In PHP I use get_headers, which returns some interesting data such as Content-MD5, but almost no files have this header.

PHP example:

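<?php
$url = 'https://download3.vmware.com/software/player/file/VMware-player-6.0.4-2249910.exe';
echo '<pre>';
// Indexed array of raw header lines
print_r(get_headers($url));
// Associative array keyed by header name (output shown below)
print_r(get_headers($url, 1));
?>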



Array
(
    [0] => HTTP/1.0 200 OK
    [Server] => Apache
    [ETag] => "df0743bf13519b6c461d50fac0fa0ded:1414635035"
    [Content-MD5] => 3wdDvxNRm2xGHVD6wPoN7Q==
    [Last-Modified] => Thu, 30 Oct 2014 02:10:35 GMT
    [Accept-Ranges] => bytes
    [Content-Length] => 98906456
    [Date] => Tue, 25 Nov 2014 19:11:28 GMT
    [Connection] => close
    [Content-Disposition] => attachment; filename="VMware-player-6.0.4-2249910.exe"
    [Content-Type] => application/x-octet-stream
)

Can I use this ETag header to check whether the file is identical? And how do I read it in Delphi?

  [ETag] => "df0743bf13519b6c461d50fac0fa0ded:1414635035"

abcd
  • Are you looking for `TIdHTTP.Head`? See http://stackoverflow.com/questions/4962096/idhttp-just-get-response-code – Sir Rufo Nov 25 '14 at 19:24

3 Answers


You can use TIdHTTP.Head() to retrieve just the file's headers without having to download the file itself, or you can use TIdHTTP.Get() to download the file and get its headers at the same time. Either method populates the TIdHTTP.Response sub-properties accordingly.

An ETag header (which you can read from the TIdHTTP.Response.ETag property) allows you to detect if a file has been changed on the server. When you download a file, you can save its ETag value as well, if one was provided. You can then use that value later to check if the server-side file has been changed since the last time you downloaded it.

HTTP has a feature for that purpose called Conditional GET. If you send a GET request with an If-None-Match header (you can use the TIdHTTP.Request.CustomHeaders property for that) specifying the ETag you already have, the server can send you a 304 Not Modified response if the server-side ETag still matches that value, so you know your copy of the file is up-to-date and the server does not waste time and bandwidth trying to send the file again.
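A Conditional GET along these lines could be sketched as follows. This is only an illustration, assuming Indy 10 and a Delphi version with scoped unit names (XE2 or later); the function name DownloadIfChanged, the example.com URL, and the file name are hypothetical. Depending on the Indy version and options, a 304 reply may either raise EIdHTTPProtocolException or simply be reported in ResponseCode, so the sketch handles both cases. For an https URL, an SSL IOHandler would also have to be assigned.

```delphi
program ConditionalGetSketch;
{$APPTYPE CONSOLE}

uses
  System.Classes, System.SysUtils, IdHTTP;

// Hypothetical helper: returns True if the file was downloaded (new or
// changed), False if the server answered 304 Not Modified.
// AETag carries the previously saved ETag in and the current one out.
function DownloadIfChanged(const AURL, AFileName: string;
  var AETag: string): Boolean;
var
  HTTP: TIdHTTP;
  Stream: TMemoryStream;
begin
  Result := False;
  HTTP := TIdHTTP.Create(nil);
  // Buffer in memory so an existing local file is untouched on a 304
  Stream := TMemoryStream.Create;
  try
    if AETag <> '' then
      HTTP.Request.CustomHeaders.Values['If-None-Match'] := AETag;
    try
      HTTP.Get(AURL, Stream);
      if HTTP.ResponseCode = 304 then
        Exit; // reported without an exception: local copy is current
      Stream.SaveToFile(AFileName);
      // Read the raw header so the value is kept exactly as sent
      AETag := HTTP.Response.RawHeaders.Values['ETag'];
      Result := True;
    except
      on E: EIdHTTPProtocolException do
        if E.ErrorCode <> 304 then
          raise; // any other HTTP error is a real failure
    end;
  finally
    Stream.Free;
    HTTP.Free;
  end;
end;

var
  SavedETag: string;
begin
  SavedETag := ''; // in practice, load the value stored after the last download
  if DownloadIfChanged('http://example.com/file.bin', 'file.bin', SavedETag) then
    Writeln('Downloaded, ETag = ', SavedETag)
  else
    Writeln('Not modified, local copy is up to date');
end.
```

Buffering into a TMemoryStream is a simplification; for very large files, downloading to a temporary file and renaming it on success would avoid holding the whole file in memory.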

In your example, the file also has a Content-MD5 header. That allows you to verify that the file you download was not modified/corrupted during the download process. You can use the TIdHTTP.Response.RawHeaders.Values['Content-MD5'] property to read that value, use the TIdDecoderMIME class to decode the value into bytes (it is base64 encoded), and use the TIdHashMessageDigest5 class to calculate an MD5 hash of your local/downloaded file and compare it to the Content-MD5 hash.
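The Content-MD5 check described above might look roughly like the sketch below. It assumes Indy 10 (with the class function TIdDecoderMIME.DecodeBytes and the TIdHash.HashStream overload that returns TIdBytes) and Delphi XE2 or later; the function name FileMatchesContentMD5 and the command-line handling are hypothetical, added only to make the example self-contained.

```delphi
program ContentMD5Sketch;
{$APPTYPE CONSOLE}

uses
  System.Classes, System.SysUtils,
  IdGlobal, IdHTTP, IdCoderMIME, IdHashMessageDigest;

// Hypothetical helper: compares the MD5 of a local file with a
// base64-encoded Content-MD5 header value.
function FileMatchesContentMD5(const AFileName, AContentMD5: string): Boolean;
var
  Expected, Actual: TIdBytes;
  Hasher: TIdHashMessageDigest5;
  FS: TFileStream;
begin
  // Content-MD5 is base64, so decode it to the raw 16 digest bytes
  Expected := TIdDecoderMIME.DecodeBytes(AContentMD5);
  Hasher := TIdHashMessageDigest5.Create;
  try
    FS := TFileStream.Create(AFileName, fmOpenRead or fmShareDenyWrite);
    try
      Actual := Hasher.HashStream(FS);
    finally
      FS.Free;
    end;
  finally
    Hasher.Free;
  end;
  Result := (Length(Actual) > 0) and (Length(Expected) = Length(Actual)) and
    CompareMem(@Expected[0], @Actual[0], Length(Actual));
end;

var
  HTTP: TIdHTTP;
  MD5Header: string;
begin
  if ParamCount < 2 then
  begin
    Writeln('usage: ContentMD5Sketch <url> <local file>');
    Exit;
  end;
  HTTP := TIdHTTP.Create(nil);
  try
    HTTP.Head(ParamStr(1)); // headers only, no file body
    MD5Header := HTTP.Response.RawHeaders.Values['Content-MD5'];
  finally
    HTTP.Free;
  end;
  if MD5Header = '' then
    Writeln('Server did not send a Content-MD5 header')
  else if FileMatchesContentMD5(ParamStr(2), MD5Header) then
    Writeln('MD5 matches')
  else
    Writeln('MD5 mismatch');
end.
```

The empty-header branch matters in practice, since many servers do not send Content-MD5 at all.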

Remy Lebeau
  • Very good explanation. But I see it is almost impossible to rely on Content-MD5, because most downloadable files do not have this header, so it comes back empty (""). The only alternative I see, which may work but will never be 100% reliable, is to use the ETag, Last-Modified and Content-Length headers. Since I have about 200 files for download, the best way is for the person to download the file, and then I check its MD5 and compare it with the one I have stored in the database. – abcd Nov 25 '14 at 23:59
  • Not all servers support `ETag`, either. Sometimes all you have to work with is `Last-Modified`. Even `Content-Length` is not always used, if `Transfer-Encoding: chunked` or `Content-Type: multipart/...` is used (such as `multipart/byteranges`). Use whatever headers are available, but worst case you may have to just rely on MD5 hashes of your local copy of the files. – Remy Lebeau Nov 26 '14 at 00:11

Yes. ETags are used for caching. This particular one combines an MD5 hash and a timestamp, but in the PHP documentation you can also find ETags such as 3f80f-1b6-3e1cb03b. The df0743bf13519b6c461d50fac0fa0ded string you see up front is the MD5 of the file (it is the hex form of the base64 Content-MD5 value), and 1414635035 is the UNIX timestamp of the file's last-modified date.

Enter the number at the end into a Unix timestamp converter and you'll see that the resulting date is equal to the "Last-Modified" value.
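The same check can be done in Delphi instead of an online converter. The sketch below splits the ETag from the question at the colon and converts the second part with UnixToDateTime from DateUtils. It assumes the hash:timestamp layout used by this particular server, which is not a general rule for ETags, and that the surrounding quotes have already been removed.

```delphi
program ETagSketch;
{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.DateUtils;

var
  ETag, Hash: string;
  Stamp: Int64;
  P: Integer;
begin
  // ETag value from the question, with the surrounding quotes removed
  ETag := 'df0743bf13519b6c461d50fac0fa0ded:1414635035';
  P := Pos(':', ETag);
  if P = 0 then
  begin
    Writeln('ETag is not in the hash:timestamp form');
    Exit;
  end;
  Hash := Copy(ETag, 1, P - 1);                   // hex MD5 part
  Stamp := StrToInt64(Copy(ETag, P + 1, MaxInt)); // seconds since 1970-01-01 UTC
  Writeln('MD5 (hex): ', Hash);
  Writeln('Modified (UTC): ',
    FormatDateTime('yyyy-mm-dd hh:nn:ss', UnixToDateTime(Stamp)));
end.
```

The printed date should correspond to the Last-Modified header shown in the question (Thu, 30 Oct 2014 02:10:35 GMT).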

Ruben Rutten
  • Good explanation, now I understand what this header is for. In some cases, though, it looks different:  [5] => Last-Modified: Tue, 15 Feb 2011 23:40:45 GMT  [6] => ETag: "cb32dac369cdcb1:0" – abcd Nov 26 '14 at 00:03

You can use a HEAD request to get the information up front:

var
  ETag: string;
begin
  // HEAD retrieves only the response headers, not the file body
  idhttp.Head('http.onedrive.com/arquive.rar');
  ETag := idhttp.Response.ETag;
end;
Dalija Prasnikar