10

I want to read the Content-Length value from the meta variable, because I need the size of the file I am about to download. But the last line raises an error: `'HTTPMessage' object has no attribute 'getheaders'`.

import urllib.request
import http.client

#----HTTP HANDLING PART----
url = "http://client.akamai.com/install/test-objects/10MB.bin"

file_name = url.split('/')[-1]
d = urllib.request.urlopen(url)
f = open(file_name, 'wb')

#----GET FILE SIZE----
meta = d.info()

print ("Download Details", meta)
file_size = int(meta.getheaders("Content-Length")[0])
Mechanical snail
scandalous

6 Answers

13

It looks like you are using Python 3 but have been reading code or documentation written for Python 2.x. It is poorly documented, but in Python 3 the object returned by info() has no getheaders method, only a get_all method.

See this bug report.
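A minimal sketch of the difference, runnable without a network connection. It builds an `http.client.HTTPMessage` by hand (the class that `urlopen(url).info()` returns for HTTP URLs in Python 3); the header value is only an illustrative stand-in for a real response.

```python
from http.client import HTTPMessage

# Build the header object by hand so the example needs no network access.
# The value below is illustrative, not taken from a real response.
meta = HTTPMessage()
meta["Content-Length"] = "10485760"

# The Python 2 method is absent on the Python 3 class.
has_getheaders = hasattr(meta, "getheaders")

# get_all() returns a list with every value stored under that header name.
values = meta.get_all("Content-Length")
file_size = int(values[0])

print(has_getheaders, values, file_size)
```

With a real response, `meta` would come from `d.info()` as in the question.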

Krumelur
  • 1
    For the benefit of people from Google, it seems you can now do `file_size = int(d.getheader('Content-Length'))` in Python 3 (tested in 3.4.1). `d.getheaders()` also seems to have been added. – freshtop Jun 19 '14 at 01:21
  • 2
    @freshtop: Both `d.getheader()` and `d.getheaders()` work even on Python 3.2. Note: OP uses `d.info()` instead of `d` here. `d.info().getheader()` and `d.info().getheaders()` are Python 2 code. To support both Python 2 and 3, [`d.headers['Content-Length']` could be used](http://stackoverflow.com/a/31576222). – jfs Jul 23 '15 at 01:03
7

For Content-Length, call getheader on the response object itself (d), not on d.info():

file_size = int(d.getheader('Content-Length'))
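One caveat: `getheader()` returns `None` when the server sends no Content-Length (for example with chunked transfer encoding), and `int(None)` raises a `TypeError`. A hedged sketch of a defensive wrapper follows; `parse_content_length` is a hypothetical helper name, not part of any library.

```python
def parse_content_length(raw_value):
    """Return the size in bytes, or None if the header is missing or malformed.

    Hypothetical helper for illustration; not part of urllib or http.client.
    """
    if raw_value is None:
        return None  # header was not sent at all
    try:
        size = int(raw_value.strip())
    except ValueError:
        return None  # header present but not a decimal integer
    return size if size >= 0 else None  # a negative length is meaningless


print(parse_content_length("10485760"))
print(parse_content_length(None))
```

It could then be used as `file_size = parse_content_length(d.getheader('Content-Length'))`, with the caller deciding what to do when the result is `None`.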
SilentGhost
nickanor
  • 1
    I think they are looking for a Python 3 solution (at least I am, and this is the top Google hit) – ThorSummoner Apr 29 '14 at 05:56
  • 1
    @ThorSummoner: `d.getheader()` works on Python 3 only. The question has the python-3.x tag, so a Python 3-only solution is appropriate. – jfs Jul 23 '15 at 01:04
6

Change the final line to:

file_size = int(meta.get_all("Content-Length")[0])
user2554726
4

You should consider using Requests:

import requests

url = "http://client.akamai.com/install/test-objects/10MB.bin"
resp = requests.get(url)

print resp.headers['content-length']
# '10485760'

For Python 3, use:

print(resp.headers['content-length'])

instead.

K Z
  • +1, If you only expect one header, go with the item operator. However, I fear there is no `headers` attribute in Python3, so it should probably be `resp.get("Content-Length")` or maybe `resp["Content-Length"]` (didn't try this) – Krumelur Oct 21 '12 at 08:56
  • There seems to be no Requests library for Python 3.2... I think I should switch versions. Which version are you using? – scandalous Oct 21 '12 at 09:02
  • @scandalous `Requests` recently added 3.3 support. I am running 2.7.3. – K Z Oct 21 '12 at 09:03
  • @Krumelur That wasn't an issue, as `resp` is a `Requests` response dict. There's one thing I need to change though.. it should be `print(resp.headers)` instead for Python3. – K Z Oct 21 '12 at 09:28
  • @scandalous You are welcome! I forgot to change `print` statement to python3's format in the original answer. – K Z Oct 21 '12 at 09:30
  • @KayZhu, yes of course. Overlooked that you had removed the `info()` call :) – Krumelur Oct 21 '12 at 10:50
  • @Krumelur Ah OK, though I didn't really remove anything in my post edit. There was never an `info()` call; I suppose you meant you misread it? :) – K Z Oct 21 '12 at 11:05
2

response.headers['Content-Length'] works on both Python 2 and 3:

#!/usr/bin/env python
from contextlib import closing

try:
    from urllib2 import urlopen
except ImportError: # Python 3
    from urllib.request import urlopen


with closing(urlopen('http://stackoverflow.com/q/12996274')) as response:
    print("File size: " + response.headers['Content-Length'])
jfs
  • This doesn't work if a header is repeated. You only get the first one when using the `headers` attribute. The *only* reliable way is to use `info().get_all()`. In Python2 `info().get()` would concatenate all duplicate headers but this fragile behavior has been removed for Py3. Unfortunately `get_all()` hasn't been backported to Py2 so we are stuck having to wrestle with this poorly documented library for more years to come. – Kevin Thibedeau Jul 20 '16 at 14:43
  • @KevinThibedeau: 1- [duplicate Content-Length headers with different values are not supported in http](https://tools.ietf.org/html/rfc7230#page-31) 2- `info()` is implemented as `return self.headers`. – jfs Jul 20 '16 at 15:34
  • From [RFC-6265](https://tools.ietf.org/html/rfc6265#section-3): "Origin servers SHOULD NOT fold multiple Set-Cookie header fields into a single header field". It is not at all unusual to receive duplicate headers. Python's libraries need to support this behavior properly. – Kevin Thibedeau Jul 20 '16 at 16:06
  • @KevinThibedeau: [Set-Cookie is a well-known exception -- you should not use it as an example for other http headers](https://tools.ietf.org/html/rfc7230#page-24). rfc7230 specifies the behavior for the Content-Length header explicitly (read the link from my previous comment). – jfs Jul 20 '16 at 16:16
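To illustrate the point about repeated headers discussed in the comments above, here is a small offline sketch. It fills an `http.client.HTTPMessage` by hand with two made-up Set-Cookie values (assumptions for illustration only) and compares item access with `get_all()`. Which value item access returns for a repeated header is not something to rely on, so the sketch only checks that it is one of them.

```python
from http.client import HTTPMessage

# Hand-built headers with a repeated field; the cookie values are made up.
headers = HTTPMessage()
headers["Set-Cookie"] = "a=1"   # item assignment appends, it does not replace
headers["Set-Cookie"] = "b=2"

single = headers["Set-Cookie"]              # one value only
every = headers.get_all("Set-Cookie")       # every value, in insertion order
missing = headers.get_all("X-Not-Present")  # None when the header is absent

print(single)
print(every)
print(missing)
```

For Content-Length, which should appear at most once, item access and `get_all(...)[0]` give the same result.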
0
import urllib.request

link = "<url here>"

f = urllib.request.urlopen(link)
meta = f.info()
print (meta.get("Content-length"))
f.close()

This works with Python 3.x.

Akshar