I'm trying to download a file using Python 3's urllib, but I get some HTML garbage instead of the actual file. However, if I use the browser, I can download the file just fine. A minimal non-working example:
import urllib.request
url = 'https://contrataciondelestado.es/wps/wcm/connect/PLACE_es/Site/area/docAccCmpnt?srv=cmpnt&cmpntname=GetDocumentsById&source=library&DocumentIdParam=ecd194a4-82e1-4fd2-8135-616622234f9b'
urllib.request.urlretrieve(url,'blah.pdf')
I also tried the two answers in this thread (setting a user agent and using the requests module), but got the same result.
Using requests
import requests
url = 'https://contrataciondelestado.es/wps/wcm/connect/PLACE_es/Site/area/docAccCmpnt?srv=cmpnt&cmpntname=GetDocumentsById&source=library&DocumentIdParam=ecd194a4-82e1-4fd2-8135-616622234f9b'
r = requests.get(url, allow_redirects=True)
with open('test.pdf', 'wb') as f:
    f.write(r.content)
print(r.is_redirect)
Same gibberish, and the requests module says the passed URL is not a redirect.
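One way to confirm what actually came back is to look at the first bytes of the payload rather than trusting the file extension. The sketch below is only an illustration: `looks_like_pdf` and `describe_payload` are hypothetical helper names, not part of requests or urllib, and the HTML check is a rough heuristic.

```python
def looks_like_pdf(payload: bytes) -> bool:
    """Return True if the payload starts with the PDF magic bytes."""
    # A PDF file begins with "%PDF-"; leading whitespace is tolerated here.
    return payload.lstrip().startswith(b"%PDF-")


def describe_payload(payload: bytes) -> str:
    """Give a rough label for downloaded bytes (hypothetical helper)."""
    if looks_like_pdf(payload):
        return "pdf"
    # Heuristic: look for an HTML tag near the start of the body.
    head = payload[:1024].lower()
    if b"<!doctype html" in head or b"<html" in head:
        return "html"
    return "unknown"
```

With the requests example above, `describe_payload(r.content)` together with `r.status_code` and `r.headers.get('Content-Type')` should show whether the server sent an HTML page instead of the PDF.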
I also tried more "sophisticated" approaches, like the download_file function suggested here, with the same result.
Any clue?
Cheers.