I use this script to download a JSON file about once a minute and save it to a unique file name. Sometimes it just hangs: it successfully saves the file that the printed line indicates, but then just waits, for hours, until I notice.
My questions are: 1) is there something obvious I haven't thought of to add (some kind of timeout?), and 2) when it is stuck, is there anything I can do to find out where it is stuck (other than putting a print line between every other line)?
If the internet connection is unresponsive, I see the "has FAILED" line about once a minute as expected, until the connection is working again, so that doesn't seem to be the problem.
Note: I save and load n this way to be a little robust against random crashes, restarts, etc.
import json
import time
import urllib2

import numpy as np

n = np.load("n.npy")
print "loaded n: ", n
n += 10  # leave a gap
np.save("n", n)
print "saved: n: ", n

url = "http:// etc..."

for i in range(10000):
    n = np.load("n.npy")
    n += 1
    try:
        req = urllib2.Request(url, headers={"Connection": "keep-alive",
                                            "User-Agent": "Mozilla/5.0"})
        response = urllib2.urlopen(req)
        dictionary = json.loads(response.read())
        filename = "info_" + str(100000 + n)[1:]
        with open(filename, 'w') as outfile:
            json.dump(dictionary, outfile)
        np.save("n", n)
        print "n, i = ", n, i, filename, "len = ", len(dictionary)
    except:
        print "n, i = ", n, i, " has FAILED, now continuing..."
    time.sleep(50)
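For reference, here is a minimal sketch of the timeout idea from question 1 (assuming Python 2.6+, where `urllib2.urlopen` accepts a `timeout` argument; the 60-second value is just an illustration):

```python
import socket

# Any socket operation (connect or read) that stalls longer than this
# raises socket.timeout, which the existing bare except: would catch.
socket.setdefaulttimeout(60)

# Alternative: pass the timeout per call instead of globally, e.g.
#   response = urllib2.urlopen(req, timeout=60)
```

Either form should turn an indefinite hang inside `urlopen` or `response.read()` into an exception after 60 seconds.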
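And for question 2, a sketch of one way to see where the process is stuck without adding prints: register a signal handler that dumps the current stack, then send the hung process a signal from another shell. This assumes a POSIX system (SIGUSR1), and the handler name `dump_stack` is just illustrative:

```python
import signal
import traceback

def dump_stack(signum, frame):
    # Print the stack of the interrupted code at the moment
    # the signal arrives, showing exactly which line is blocked.
    traceback.print_stack(frame)

# Register once at startup; later, `kill -USR1 <pid>` from a shell
# makes the hung process print where it currently is.
signal.signal(signal.SIGUSR1, dump_stack)
```

The process keeps running after the handler returns, so this can be left in place permanently.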