Why does this image-download-script not work?

Question

I have some code and a question about it.

import string
import random
import httplib
import urllib
import os
import sys
import inspect

def id_generator(size=5, chars=string.ascii_letters + string.digits):
    return ''.join(random.choice(chars) for _ in range(size))


picnumber = raw_input('Please enter the amount of images you want!')
nothingfoundnumber=0
foundnummer=0

scriptpath = os.path.dirname(sys.argv[0])
filename = scriptpath + "/output/"
if not os.path.exists(os.path.dirname(filename)):
    os.makedirs(os.path.dirname(filename))


while foundnummer != picnumber:
    randompicstring = id_generator()
    print "Trying " + str(randompicstring)
    try:
        urllib.urlretrieve("http://i.imgur.com/" +randompicstring+ ".gif", "/test/" +randompicstring + ".gif")
        foundnummer+=1
        print str(randompicstring) + "was found! Makes " +str(foundnummer)+ " out of " +str(picnumber)+"!"
    except IOError:
        nothingfoundnumber+=1
        print str(randompicstring) + "not found. It was the "+str(nothingfoundnumber)+" try."

The purpose of this is to try random combinations of ASCII letters and digits to find images on imgur.com (e.g. https://i.stack.imgur.com/onlof.png). If it finds an image, it should say so, save the image and increment `foundnummer`. If it doesn't find an image, it should say so and increment `nothingfoundnumber`.

Right now it doesn't work: it always reports that it found something, but it saves nothing. Can someone help me fix this?
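
Separately from the download problem, the loop condition in the code above compares the integer `foundnummer` with the string returned by `raw_input`, and in Python 2 an integer never equals a string, so the loop cannot stop. A minimal sketch of one way to handle that, written for Python 3 (where `input` plays the role of `raw_input`); `parse_count`, `collect` and the `attempt` callable are hypothetical names used only for illustration:

```python
def parse_count(text):
    """Convert the user's answer to a positive integer."""
    count = int(text.strip())
    if count < 1:
        raise ValueError("count must be at least 1")
    return count


def collect(target, attempt):
    """Call attempt() until it has succeeded `target` times.

    `attempt` is a hypothetical callable that returns True when an
    image was saved and False otherwise.
    Returns a tuple (found, missed).
    """
    found = 0
    missed = 0
    # "<" instead of "!=" also stops if the counter ever overshoots.
    while found < target:
        if attempt():
            found += 1
        else:
            missed += 1
    return found, missed
```

With this, the prompt would become something like `target = parse_count(input('Please enter the amount of images you want! '))`, and the download logic would go into the function passed as `attempt`.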

`urlretrieve` returns a tuple (filename, `httplib.HTTPMessage` instance). Your 404 is in the latter. Please see http://stackoverflow.com/questions/1308542/how-to-catch-404-error-in-urllib-urlretrieve — DrV, Jun 14 '14 at 20:14
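
As a rough illustration of using the second element of that tuple, the sketch below inspects the returned headers and discards the file if the `Content-Type` is not an image. It is written for Python 3, where the function lives at `urllib.request.urlretrieve` and, unlike the Python 2 version in the question, raises an error for a 404. `is_image` and `download_if_image` are hypothetical helper names, and the check assumes that an error page is served as something other than an image, which may not hold for every server:

```python
import os
import urllib.error
import urllib.request


def is_image(headers):
    """True if the Content-Type header announces an image."""
    content_type = headers.get("Content-Type") or ""
    return content_type.lower().startswith("image/")


def download_if_image(url, target_path):
    """Download url to target_path; keep the file only if it is an image.

    Returns True if an image was saved, False otherwise.
    """
    try:
        filename, headers = urllib.request.urlretrieve(url, target_path)
    except urllib.error.URLError:
        # Covers HTTPError (e.g. 404) and connection problems in Python 3.
        return False
    if is_image(headers):
        return True
    os.remove(filename)  # not an image: drop the downloaded file
    return False
```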

score 1 · Answer 1 · answered Jun 14 '14 at 20:16


You should also probably look at using the Imgur API rather than generating random strings. It looks like there is an endpoint for random images.

chrisb


score 0 · Answer 2 · answered Jun 14 '14 at 20:11

This is probably because `urlretrieve` does not raise an exception when a 404 error occurs. You can call `urlopen` before `urlretrieve` to check whether the URL returns a 404:

randompicstring = id_generator()
print "Trying " + str(randompicstring)
url = "http://i.imgur.com/" +randompicstring+ ".gif"

res = urllib.urlopen(url) # test url
if res.getcode() == 200: # valid link
    try:
        urllib.urlretrieve(url, "/test/" +randompicstring + ".gif") # download
        foundnummer+=1
        print str(randompicstring) + "was found! Makes " +str(foundnummer)+ " out of " +str(picnumber)+"!"
    except IOError:
        print "IOError..."
else: # invalid link
    nothingfoundnumber+=1
    print str(randompicstring) + "not found. It was the "+str(nothingfoundnumber)+" try."

Thanks, but it still doesn't work right. Sample output: `mxNDlwas found! Makes 76 out of 10! Trying NfUsn NfUsnwas found! Makes 77 out of 10! Trying jVF0e` There are still no images saved. — Woelfi, Jun 14 '14 at 20:22
It seems that imgur returns a `200` code even if the image does not exist... In that case, I think @chrisb has the right answer: you should use the Imgur API! — julienc, Jun 14 '14 at 20:26
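
If the status code alone cannot tell the two cases apart, one thing that might be worth trying is comparing the URL that was requested with the URL that was finally served. The sketch below is written for Python 3 and rests on an assumption that nobody in this thread confirmed: that a missing image is answered with a redirect to some other resource, such as a placeholder. `same_resource` and `image_exists` are hypothetical helper names:

```python
import urllib.request
from urllib.parse import urlparse


def same_resource(requested_url, final_url):
    """True if both URLs have the same host and path (the scheme may differ)."""
    a = urlparse(requested_url)
    b = urlparse(final_url)
    return (a.netloc.lower(), a.path) == (b.netloc.lower(), b.path)


def image_exists(url):
    """Guess whether url points at a real image.

    Assumption: a missing image leads to a redirect, so the final URL
    reported by the response differs from the requested one.
    """
    with urllib.request.urlopen(url) as response:
        return response.status == 200 and same_resource(url, response.geturl())
```

A plain upgrade from `http` to `https` on the same host and path is deliberately not treated as a redirect to a different resource. If the server turns out to behave differently, the Imgur API suggested above remains the more reliable route.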
