
Possible Duplicate:
How to download image using requests

I have this Python script for scraping image URLs from a Tumblr blog, and I would like to download them to a local folder on my desktop. How would I go about implementing this?

import requests 
from bs4 import BeautifulSoup 

def make_soup(url):
    # downloads a page with requests and creates a BeautifulSoup object
    raw_page = requests.get(url).text
    soup = BeautifulSoup(raw_page, 'html.parser')

    return soup


def get_images(soup):
    # pulls image URLs from the current page
    images = []

    found_images = soup.find_all('img')

    for image in found_images:
        url = image['src']

        if 'media.tumblr.com' in url:
            images.append(url)

    return images


def scrape_blog(url):
    # scrapes the entire blog, following the 'next page' links
    soup = make_soup(url)
    images = get_images(soup)

    next_page = soup.find('a', id='nextpage')

    while next_page is not None:
        soup = make_soup(url + next_page['href'])
        next_page = soup.find('a', id='nextpage')

        more_images = get_images(soup)
        images.extend(more_images)

    return images


url = 'http://x.tumblr.com'
images = scrape_blog(url)
Lucifer N.

1 Answer


Python's urllib2 module (urllib.request in Python 3) is probably what you're looking for. If you need to do anything complicated (such as handling cookies or authentication), it may be worth looking into a wrapper library such as Requests, which provides a nicer interface over the more cumbersome parts of the standard library.
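As a rough sketch of the download step using the standard library's urllib.request (the function names, the helper local_path, and the images folder are my own choices, not part of the question):

```python
import os
from urllib.request import urlretrieve

def local_path(url, folder):
    # derive a local filename from the last segment of the URL path
    return os.path.join(folder, url.split('/')[-1])

def download_images(urls, folder='images'):
    # save each image URL into a local folder, creating it if needed
    os.makedirs(folder, exist_ok=True)
    for url in urls:
        urlretrieve(url, local_path(url, folder))
```

Calling download_images(scrape_blog(url)) after your scraper would save every collected image. Since you already use requests, writing requests.get(url).content to a file opened in 'wb' mode works just as well.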

Ryan