Download .gz file using requests in Python Error

Question

I would be very grateful if anyone could help me with this issue I am having.

I’m trying to use the requests library to download a .gz file from the internet. I have used the library successfully before to fetch XML data that is displayed in the browser, but the .gz download is not working.

When I click the URL_To_Gzip link in my browser, the .gz file starts downloading automatically, so the URL is fine; it just points directly to the file.

I’m trying to do this in Python 2.7 so I can then process the file and the data it contains, but I get an error message that I am struggling to resolve.

Error Message:

HTTPSConnectionPool(host=HOST_URL_TO_GZip, port=443): Max retries exceeded with url: URL_TO_GZip.gz (Caused by : [Errno 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond)

import requests

data = requests.get(url_to_gzip, proxies={"http": proxy_url})  # Does not work

data = requests.get(url_to_gzip, proxies={"http": proxy_url}, stream=True)  # Does not work

The information on Errno 10060 suggests the error is related to my proxy, because a connection cannot be established. However, I have successfully used the same proxy settings to get the XML data in a similar script.

Thanks,

Ravi

EDIT

The URL_TO_GZip.gz file is served over https://, whereas the XML file that works is served over http://. I think this is the cause of my problem and why it works for one file but not the other.

score 0 · Accepted Answer · answered Feb 27 '15 at 19:50

For anyone else who comes across this issue: I needed to pass an auth=(username, password) keyword argument to requests.get to access the HTTPS site.
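A minimal sketch of what that call might look like, for anyone who wants something to copy from. The function name download_gz, its parameters, the 30-second timeout, and the injectable get argument are all made up for illustration and are not from my original script. It also adds an "https" entry to the proxies dict, since requests selects the proxy by URL scheme; that was not part of my fix, but it may matter if your proxy is needed for HTTPS too.

```python
def download_gz(url, dest_path, username, password, proxy_url=None, get=None):
    """Stream a .gz file to disk, sending HTTP basic-auth credentials.

    `get` is a hypothetical hook for swapping in a stand-in for
    requests.get (handy for trying the function without a network).
    """
    if get is None:
        import requests  # imported lazily so the helper can be exercised offline
        get = requests.get

    # requests picks the proxy entry by URL scheme, so an https:// URL
    # uses the "https" key; an "http" key alone does not apply to it.
    proxies = {"http": proxy_url, "https": proxy_url} if proxy_url else None

    response = get(
        url,
        auth=(username, password),  # the keyword that fixed my problem
        proxies=proxies,
        stream=True,                # avoid loading the whole file into memory
        timeout=30,                 # assumed value; tune for your network
    )
    response.raise_for_status()     # fail loudly on 401/403/404 and similar

    with open(dest_path, "wb") as out_file:
        for chunk in response.iter_content(chunk_size=8192):
            if chunk:               # skip keep-alive chunks
                out_file.write(chunk)
    return dest_path
```

Once the file is on disk, it can be read with the standard library, e.g. gzip.open(dest_path, "rb").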

12avi
