Essentially, the script downloads images from wallbase.cc's random and toplist pages. It looks for the 7-digit string that identifies each image, inserts that ID into a URL, and downloads the image. The only problem I seem to have is isolating the 7-digit string.
What I want to be able to do is search for <div id="thumbxxxxxxx" and then assign xxxxxxx to a variable.
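To make that concrete, here is a minimal sketch of the extraction I have in mind, run against a made-up snippet rather than the live page. It assumes the thumbnails really are marked up as <div id="thumb1234567" ...>. The sample HTML, the ids in it, and the name extract_ids are placeholders for illustration, not anything taken from wallbase.cc.

```python
import re

# Hypothetical sample of the page markup; the real page may differ.
SAMPLE_HTML = '''
<div id="thumb1234567" class="thumbnail">...</div>
<div id="thumb7654321" class="thumbnail">...</div>
<div id="thumbnails">not an image</div>
'''

# Capture exactly seven digits between <div id="thumb and the closing quote.
THUMB_ID = re.compile(r'<div id="thumb(\d{7})"')


def extract_ids(html):
    """Return every seven-digit id found in html, in page order."""
    return THUMB_ID.findall(html)


ids = extract_ids(SAMPLE_HTML)
# Assign the first id to a variable, or None if the page had no matches.
first_id = ids[0] if ids else None
```

Requiring the closing quote right after the seven digits should keep longer or non-numeric ids such as "thumbnails" from matching.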
Here's what I have so far.
import urllib
import os
import sys
import re
#Written in Python 2.7 with LightTable
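def get_id():
    # Fetch the listing page; urlid is 'toplist' or 'random' and is set further down.
    # urllib.request only exists in Python 3, so use urllib.urlopen on 2.7.
    the_page = urllib.urlopen('http://wallbase.cc/' + urlid).read()
    # Assumption: each thumbnail carries a data-id="xxxxxxx" attribute.
    # Collect every seven-digit id that follows it.
    return re.findall(r'data-id="(\d{7})"', the_page)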
def toplist():
    # We need to define how to find the images to download.
    # The idea is to go to http://wallbase.cc/x and to take all of the strings containing <a href="http://wallbase.cc/wallpaper/xxxxxxx" </a>
    # and to request the image file from that URL.
    # Then the file will be put in a user-defined directory.
    image_id = raw_input("Enter the seven digit identifier for the image to be downloaded to " + directory + "...\n>>> ")
    url = 'http://wallpapers.wallbase.cc/rozne/wallpaper-' + image_id + '.jpg'
    # os.path.join adds the path separator if the user left it off.
    with open(os.path.join(directory, image_id + '.jpg'), 'wb') as f:
        f.write(urllib.urlopen(url).read())
directory = raw_input("Enter the directory in which the images will be downloaded.\n>>> ")
# raw_input returns a string, so convert it before comparing against 1 and 2.
initial_prompt = int(raw_input("What do you want to download from?\n\t1: Toplist\n\t2: Random\n>>> "))
if initial_prompt == 1:
    urlid = 'toplist'
    toplist()
elif initial_prompt == 2:
    urlid = 'random'
    random()  # not written yet
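Once an id is isolated, the download step only needs that id and the target directory. Below is a rough sketch of building the URL and the output path, without touching the network. It assumes the wallpapers.wallbase.cc/rozne/ pattern used in toplist() holds for every image, which I have not verified. build_paths and IMAGE_URL are hypothetical names.

```python
import os

# Assumed URL pattern, copied from the download line in toplist().
IMAGE_URL = 'http://wallpapers.wallbase.cc/rozne/wallpaper-{0}.jpg'


def build_paths(image_id, directory):
    """Return (url, local_path) for one seven-digit image id."""
    # Reject anything that is not exactly seven digits before building a URL.
    if not (len(image_id) == 7 and image_id.isdigit()):
        raise ValueError('expected a seven digit id, got %r' % image_id)
    url = IMAGE_URL.format(image_id)
    local_path = os.path.join(directory, image_id + '.jpg')
    return url, local_path


url, path = build_paths('1234567', 'wallpapers')
```

Each id found on the page could then be passed through build_paths in a loop, with the actual fetch and file write done on the returned pair.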
Any/all help is very much appreciated :)