
I've got a large file somewhere (FTP/HTTP).

I want to

  1. Download first N bytes,
  2. Check its header which is embedded into the file (whether the version differs)
  3. Then decide whether to proceed with or abort the download.

To my surprise, this is not as straightforward as I had imagined. Even calling wget/curl externally doesn't seem to be a good solution (maybe I overlooked the right command-line option).

How can this be done as simply as possible in Python?

I'm thinking about a custom handler for ftp.retrbinary that raises an exception as soon as the total size of the received blocks exceeds a defined value, but that seems like overkill to me. Python code is supposed to be elegant, right?
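A rough sketch of that handler idea, for illustration only. The names `EnoughData` and `ftp_first_bytes` are made up here, and the sketch assumes `ftp` is an already logged-in `ftplib.FTP` object. It also assumes the connection is closed and reopened afterwards, because aborting `retrbinary` with an exception leaves the server's final reply unread on the control connection.

```python
import ftplib


class EnoughData(Exception):
    """Raised by the callback to stop the transfer early (hypothetical name)."""


def ftp_first_bytes(ftp, path, n):
    """Return at most the first n bytes of `path` from a logged-in FTP object."""
    chunks = []
    received = 0

    def collect(block):
        # retrbinary calls this once per received block.
        nonlocal received
        chunks.append(block)
        received += len(block)
        if received >= n:
            raise EnoughData

    try:
        ftp.retrbinary("RETR " + path, collect)
    except EnoughData:
        # Transfer interrupted on purpose; the caller should reconnect
        # before issuing further commands on this session.
        pass
    return b"".join(chunks)[:n]
```

If the file is shorter than `n` bytes, the transfer simply finishes and the whole file is returned.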

Miro Kropacek
  • Do you need just the headers? Take a look at [`requests`](http://python-requests.org/); it lets you stream a request, where you can grab just the headers and then close the connection if you want to. *Or* read the first N bytes of the response body too, if you want to. – Martijn Pieters Jan 31 '13 at 10:40
  • possible duplicate of [Download file using partial download (HTTP)](http://stackoverflow.com/questions/1798879/download-file-using-partial-download-http) – Kien Truong Jan 31 '13 at 11:03
  • Dikei: I've seen the answer but it doesn't cover FTP, I need this, too. – Miro Kropacek Jan 31 '13 at 11:12

1 Answer


If you only want to check the headers, send an HTTP HEAD request rather than a GET. It returns the same headers as a GET, with no message body.

The HEAD method is identical to GET except that the server MUST NOT return a message-body in the response. The metainformation contained in the HTTP headers in response to a HEAD request SHOULD be identical to the information sent in response to a GET request.

A HEAD can be sent as detailed here.
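As a minimal sketch, a HEAD request can be issued with the Python 3 standard library (`urllib.request`, the successor to `urllib2`). The function name `fetch_headers` and the default timeout are illustrative assumptions.

```python
import urllib.request


def fetch_headers(url, timeout=10):
    """Send a HEAD request and return the response headers as a dict."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # No body is transferred for HEAD; only the headers come back.
        return dict(response.headers.items())
```

For example, `fetch_headers(url).get("Content-Length")` would give the advertised size of the file, if the server reports one.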


EDIT:

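If you do need the first N bytes, you could use urllib2 in conjunction with the Range header. Byte ranges are inclusive, so Range: bytes=0-(N-1) requests exactly the first N bytes, provided the server supports range requests.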
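A minimal sketch of the Range approach, written for Python 3 where `urllib2` became `urllib.request`. The function name `fetch_first_bytes` and the timeout are assumptions made for illustration. Since a server may ignore the Range header and send the whole file, the sketch also caps the read at `n` bytes and closes the connection.

```python
import urllib.request


def fetch_first_bytes(url, n, timeout=10):
    """Return at most the first n bytes of the resource at `url`."""
    # Byte ranges are inclusive, so 0..n-1 covers exactly n bytes.
    request = urllib.request.Request(
        url, headers={"Range": "bytes=0-%d" % (n - 1)}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # A server honoring Range answers 206; one ignoring it answers 200
        # with the full body. read(n) limits the transfer in both cases,
        # and leaving the with-block closes the connection.
        return response.read(n)
```

The returned bytes can then be checked for the embedded version before deciding whether to start the full download.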

Community
Anirudh Ramanathan