How to get image file size in python when fetching from URL (before deciding to save)

Question

import urllib.request,io
url = 'http://www.image.com/image.jpg'

path = io.BytesIO(urllib.request.urlopen(url).read())

I'd like to check the file size of the URL image in the filestream path before saving. How can I do this?

Also, I don't want to rely on Content-Length headers. I'd like to fetch the image into a filestream, check the size, and then save.

Possible duplicate: http://stackoverflow.com/questions/5909/get-size-of-a-file-before-downloading-in-python – Ihor Pomaranskyy, Mar 30 '15 at 07:43
Why do you need to avoid relying on Content-Length headers? You can check the size of a `BytesIO` object the same way you can with any open file object, by seeking to the end and calling `fobj.tell()`. But if you use the Content-Length header you can *avoid having to read the whole image into memory first*. – Martijn Pieters, Mar 30 '15 at 08:32

itzMEonTV · Answer 1 · 2015-03-30T09:21:31.010

score 2

Try importing urllib.request

import urllib.request, io
url = 'http://www.elsecarrailway.co.uk/images/Events/TeddyBear-3.jpg'
path = urllib.request.urlopen(url)
meta = path.info()

>>> meta.get("Content-Length")
'269898'  # i.e. about 270 kB

edited Mar 30 '15 at 09:21

answered Mar 30 '15 at 07:51


But now your answer is functionally no different from llogiq and goes directly against what the OP is asking for. – Martijn Pieters Mar 30 '15 at 09:19

score 2 · Answer 2 · answered Mar 30 '15 at 08:38

You can get the size of the io.BytesIO() object the same way you can get it for any file object: by seeking to the end and asking for the file position:

path = io.BytesIO(urllib.request.urlopen(url).read())
path.seek(0, 2)  # 0 bytes from the end
size = path.tell()
path.seek(0)     # rewind so later reads start at the beginning

However, you could just as easily take the len() of the bytestring you read, before wrapping it in an in-memory file object:

data = urllib.request.urlopen(url).read()
size = len(data)
path = io.BytesIO(data)

Note that this means your image has already been loaded into memory. You cannot use this to avoid loading an image that is too large. For that, checking the Content-Length header before reading is the only way to know the size up front.

If the server uses a chunked transfer encoding to facilitate streaming (so no content length has been set up front), you can use a loop to limit how much data is read.
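
A minimal sketch of such a loop follows. The names read_capped and fetch_image, the max_bytes parameter and the 8192-byte chunk size are assumptions for illustration. The loop works on any file-like object, so it also covers responses that carry no Content-Length.

```python
import io
import urllib.request


def read_capped(stream, max_bytes, chunk_size=8192):
    """Read a file-like object in chunks, giving up past max_bytes.

    Returns the data as bytes, or None if the limit was exceeded.
    """
    buf = io.BytesIO()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:              # empty bytes means end of stream
            break
        buf.write(chunk)
        if buf.tell() > max_bytes:
            return None            # too large; stop reading here
    return buf.getvalue()


def fetch_image(url, max_bytes):
    """Hypothetical helper: fetch url, refusing bodies larger than max_bytes."""
    with urllib.request.urlopen(url) as response:
        declared = response.headers.get("Content-Length")
        # Cheap early exit when the server does declare a size.
        if declared is not None and declared.isdigit() and int(declared) > max_bytes:
            return None
        # The header may be missing or wrong, so still cap the actual read.
        return read_capped(response, max_bytes)
```

At most max_bytes plus one chunk is ever held in memory, whatever the server sends.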

score 0 · Answer 3 · answered Mar 30 '15 at 07:47

You could ask the server for the Content-Length information. Using urllib2 (which is Python 2 only; in Python 3 the equivalent lives in urllib.request):

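import urllib2

req = urllib2.urlopen(url)
meta = req.info()
# getheader() returns None when the server sent no Content-Length
length_text = meta.getheader("Content-Length")
try:
    length = int(length_text)
except (TypeError, ValueError):
    # length unknown, you may need to read the data to find out
    length = -1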
