When you use urlopen, you are requesting the whole contents (an HTTP GET request), so looking for the optional Content-Length header is not all that useful once you've gone that way. Checking the header is fine and saves you some time and memory compared with reading the body, but you have already imposed avoidable load on the server and network. Still, as the existing answer indicates, taking the len of the read() of urlopen's result is the way that will work even if Content-Length is missing.
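A minimal sketch of the GET approach, under a few assumptions: the helper name length_via_get is hypothetical, and the try/except import is only there so the same code runs on Python 2 (urllib2, as discussed here) and on Python 3 (where urlopen lives in urllib.request). It returns both the declared Content-Length, if any, and the measured body size.

```python
try:
    from urllib2 import urlopen            # Python 2, the module discussed here
except ImportError:
    from urllib.request import urlopen     # Python 3 equivalent (assumption)


def length_via_get(url):
    """Fetch url with GET; return (declared_length_or_None, actual_length).

    Hypothetical helper, for illustration only.
    """
    response = urlopen(url)
    try:
        # The header is optional, so this may be None.
        declared = response.info().get('Content-Length')
        # Reading the whole body always works, at the cost of downloading it.
        actual = len(response.read())
    finally:
        response.close()
    return (int(declared) if declared is not None else None, actual)
```

Note that the body is downloaded either way; the header only tells you the size a little earlier.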
Alas, urllib2 does not directly support the HTTP HEAD method. To try HEAD, you have to use the lower-level module httplib: make an HTTPConnection to the server, call its request('HEAD', url) method, call its getresponse method to get an HTTPResponse object, then call the getheader method on the latter to get the Content-Length header... you see why I say the module is lower-level ;-). If you're dealing with very large pages and sensible servers (ones that do set the Content-Length header), this, while messy, could be an important optimization.
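The HEAD sequence could be sketched roughly as follows. The helper name length_via_head is hypothetical. The sketch assumes a plain http:// URL, does not follow redirects, and passes only the path and query portion of the URL to request. The try/except import lets it run on Python 2 (httplib) and Python 3 (http.client).

```python
try:
    from httplib import HTTPConnection        # Python 2, the module discussed here
    from urlparse import urlsplit
except ImportError:
    from http.client import HTTPConnection    # Python 3 equivalent (assumption)
    from urllib.parse import urlsplit


def length_via_head(url):
    """Send a HEAD request; return Content-Length as an int, or None if absent.

    Hypothetical helper: plain HTTP only, no redirect handling.
    """
    parts = urlsplit(url)
    # A port of None makes HTTPConnection fall back to the default (80).
    conn = HTTPConnection(parts.hostname, parts.port)
    try:
        path = parts.path or '/'
        if parts.query:
            path += '?' + parts.query
        conn.request('HEAD', path)
        response = conn.getresponse()
        value = response.getheader('Content-Length')
        return int(value) if value is not None else None
    finally:
        conn.close()
```

A None result means the server did not set the header, in which case falling back to a GET and measuring the body is the remaining option.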