Download first N bytes of a file in python

Question

I've got a large file somewhere (FTP/HTTP).

I want to:

1. Download the first N bytes,
2. Check the header embedded in the file (to see whether the version differs),
3. Then decide whether to proceed with or abort the download.

To my surprise, this is not as straightforward a task as I had imagined. Even calling wget/curl externally doesn't seem to be a good solution (maybe I overlooked the right command-line option).

How could this be done as simply as possible in Python?

I'm thinking about a custom handler for ftp.retrbinary that raises an exception as soon as the total size of the received blocks exceeds a defined value, but that seems like overkill to me. Python code is supposed to be elegant, right?
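For reference, the handler idea described above could be sketched roughly as follows with Python 3's `ftplib`. This is only an illustration under assumptions: the names `ftp_first_bytes` and `_EnoughData` are hypothetical, and because the transfer is aborted mid-stream the control connection is simply closed afterwards rather than reused.

```python
import ftplib


class _EnoughData(Exception):
    """Hypothetical marker exception used to stop the transfer early."""


def ftp_first_bytes(host, path, n, user="anonymous", passwd=""):
    """Return at most the first n bytes of path on an FTP server."""
    chunks = []
    received = 0

    def collect(block):
        # Called by retrbinary for every block that arrives.
        nonlocal received
        chunks.append(block)
        received += len(block)
        if received >= n:
            raise _EnoughData  # abort once enough data has arrived

    ftp = ftplib.FTP(host)
    try:
        ftp.login(user, passwd)
        try:
            ftp.retrbinary(f"RETR {path}", collect)
        except _EnoughData:
            pass  # expected: we stopped the download on purpose
    finally:
        # The aborted transfer leaves the session in an undefined state,
        # so drop the connection instead of sending QUIT.
        ftp.close()
    return b"".join(chunks)[:n]
```

If the file is shorter than `n` bytes, the transfer finishes normally and the whole file is returned.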

Do you need just the headers? Take a look at [`requests`](http://python-requests.org/); it lets you stream a request, so you can grab just the headers and then close the connection if you want to. *Or* read the first N bytes of the response body too, if you want to. — Martijn Pieters, Jan 31 '13 at 10:40
possible duplicate of [Download file using partial download (HTTP)](http://stackoverflow.com/questions/1798879/download-file-using-partial-download-http) — Kien Truong, Jan 31 '13 at 11:03
Dikei: I've seen the answer but it doesn't cover FTP, I need this, too. — Miro Kropacek, Jan 31 '13 at 11:12

score 2 · Answer 1 · edited May 23 '17 at 11:49

If you want to check just the headers, send an HTTP HEAD request rather than a GET. It will return the same headers as a GET, with no message body.

The HEAD method is identical to GET except that the server MUST NOT return a message-body in the response. The metainformation contained in the HTTP headers in response to a HEAD request SHOULD be identical to the information sent in response to a GET request.

A HEAD can be sent as detailed here.
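As a rough illustration, a HEAD request could be sent with Python 3's `urllib.request` along these lines. The function name `fetch_headers` and the timeout value are assumptions made for the example; note that this only retrieves the HTTP headers, not a header embedded in the file itself.

```python
import urllib.request


def fetch_headers(url, timeout=10):
    """Send a HEAD request and return the response headers as a dict."""
    # method="HEAD" asks the server for the headers only, with no body.
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return dict(response.headers.items())
```

A caller could then inspect, for example, `fetch_headers(url).get("Content-Length")` before deciding whether to download the file.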

EDIT:

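If you do need the first N bytes, you could use urllib2 (urllib.request in Python 3) in conjunction with the Range header. Byte ranges are inclusive, so the first N bytes are requested with Range: bytes=0-(N-1).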
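A minimal sketch of the Range approach, assuming Python 3 (where `urllib2` became `urllib.request`). The function name `fetch_first_bytes` is hypothetical. Not every server honors `Range`, so the sketch also caps how much it reads.

```python
import urllib.request


def fetch_first_bytes(url, n, timeout=10):
    """Return at most the first n bytes of the resource at url."""
    # Byte ranges are inclusive, so the first n bytes are 0..n-1.
    request = urllib.request.Request(url, headers={"Range": f"bytes=0-{n - 1}"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # A server that honors the range answers 206 Partial Content; one
        # that ignores it answers 200 with the whole body. Reading only n
        # bytes and then closing the connection covers both cases.
        return response.read(n)
```

The returned bytes can then be parsed for the embedded version header, and the full download started only if the version differs.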

answered Jan 31 '13 at 10:41 by Anirudh Ramanathan; edited May 23 '17 at 11:49 by Community

Sounds like Miro's more interested in a header embedded in the file, rather than the HTTP headers. – Miles Jan 31 '13 at 11:05
@MiroKropacek You could use the `Range` header in that case, in conjunction with GET. – Anirudh Ramanathan Jan 31 '13 at 11:19
