My code that downloads files from URLs using requests stalls for no apparent reason. When I start the script, it downloads several hundred files, but then it simply stops at some point. If I open the URL manually in the browser, the image loads without a problem. I also tried urllib.urlretrieve, but had the same problem. I use Python 2.7.5 on OS X.
Below you will find:
- the code I use,
- the stacktrace (dtruss) captured while the program is stalling, and
- the traceback that is printed when I Ctrl-C the process after nothing has happened for 10 minutes.
Code:
import requests

def download_from_url(url, download_path):
    with open(download_path, 'wb') as handle:
        response = requests.get(url, stream=True)
        # Write the body to disk in 1 KB blocks.
        for block in response.iter_content(1024):
            if not block:
                break
            handle.write(block)

def download_photos_from_urls(urls, concept):
    ensure_path_exists(concept)  # helper defined elsewhere in the script
    bad_results = list()
    for i, url in enumerate(urls):
        print i, url,
        download_path = concept+'/'+url.split('/')[-1]
        try:
            download_from_url(url, download_path)
            print
        except IOError as e:
            print str(e)
            bad_results.append(url)  # remember the URL that failed
    return bad_results
Stacktrace:
My-desk:~ Me$ sudo dtruss -p 708
SYSCALL(args) = return
Traceback:
318 http://farm1.static.flickr.com/32/47394454_10e6d7fd6d.jpg
Traceback (most recent call last):
  File "slow_download.py", line 71, in <module>
    if final_path == '':
  File "slow_download.py", line 34, in download_photos_from_urls
    download_path = concept+'/'+url.split('/')[-1]
  File "slow_download.py", line 21, in download_from_url
    with open(download_path, 'wb') as handle:
  File "/Library/Python/2.7/site-packages/requests/models.py", line 638, in generate
    for chunk in self.raw.stream(chunk_size, decode_content=True):
  File "/Library/Python/2.7/site-packages/requests/packages/urllib3/response.py", line 256, in stream
    data = self.read(amt=amt, decode_content=decode_content)
  File "/Library/Python/2.7/site-packages/requests/packages/urllib3/response.py", line 186, in read
    data = self._fp.read(amt)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 567, in read
    s = self.fp.read(amt)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 380, in read
    data = self._sock.recv(left)
KeyboardInterrupt
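
The last frame shows the process blocked in `self._sock.recv(left)`, which is consistent with a read that has no timeout: if the server stops sending data, `recv` can wait indefinitely. The following is only a minimal sketch of one thing to try, not a confirmed fix. It passes the `timeout` argument that `requests.get` accepts. The function name `download_with_timeout`, the 30-second default, and the `get` parameter (a hook for substituting a stand-in during testing) are assumptions made for illustration.

```python
def download_with_timeout(url, download_path, timeout=30, get=None):
    """Download url to download_path, failing instead of hanging on a stalled read.

    timeout is an assumed value in seconds. requests applies it to each wait
    for data on the socket, not to the download as a whole.
    """
    if get is None:
        # Imported here so a stand-in `get` can be used without the library.
        import requests
        get = requests.get
    response = get(url, stream=True, timeout=timeout)
    # Turn HTTP error statuses (404, 500, ...) into exceptions.
    response.raise_for_status()
    # Open the file only after the request succeeded, so a failed
    # request does not leave an empty file behind.
    with open(download_path, 'wb') as handle:
        for block in response.iter_content(1024):
            if block:  # skip empty keep-alive chunks
                handle.write(block)
```

With this in place, a stalled connection should raise an exception after roughly `timeout` seconds without data. As far as I know, `requests` exceptions derive from `IOError`, so the existing `except IOError` clause should catch them, but this is worth verifying against the installed version. A timeout in the middle of a transfer can still leave a partially written file on disk.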