
I am building a Beautiful Soup script that scrapes a news website. One of the scraped strings is an image URL (`img_src`). Is there a way to download the images at those URLs?

I am using Python 2.7.16. Here is the script:

    # -*- coding: utf-8 -*-

    from bs4 import BeautifulSoup
    import requests
    import time
    import csv

    source = requests.get('https://jornalnoticias.co.mz/index.php/desporto').text

    soup = BeautifulSoup(source, 'lxml')

    # prepare csv file
    csv_file = open('jornalnoticias.csv', 'a')
    csv_writer = csv.writer(csv_file)
    csv_writer.writerow(['headline', 'url_src', 'img_src', 'news_src', 'cat', 'epoch'])

    news_src = 'Jornal Notícias'
    cat = 'Desporto'
    epoch = time.time()

    # loop over articles
    for article in soup.find_all('div', itemprop='blogPost'):
        try:
            headline = article.h2.a.text.replace('\t', '').encode('utf8')
        except Exception:
            headline = None
        try:
            url_src = 'https://jornalnoticias.co.mz' + article.find('a', href=True)['href']
        except Exception:
            url_src = None
        try:
            img_src = 'https://jornalnoticias.co.mz' + article.find('a', class_='hover-zoom')['href']
        except Exception:
            img_src = None

        print(headline)
        print(url_src)
        print(img_src)

        # write csv
        csv_writer.writerow([headline, url_src, img_src, news_src, cat, epoch])

    csv_file.close()
krustov
  • Sorry, but no. My knowledge of Python is pretty basic. I got this far with Beautiful Soup through a tutorial. – krustov Apr 30 '20 at 13:12
  • Welcome to SO. This isn't a discussion forum or tutorial. Please take the [tour] and take the time to read [ask] and the other links found on that page. Invest some time with [the Tutorial](https://docs.python.org/3/tutorial/index.html) practicing the examples. It will give you an idea of the tools Python offers to help you solve your problem. – wwii Apr 30 '20 at 13:13
  • The answer to your question -`Is there a way to download the images that are scraped?` - is yes. You cannot do it with Beautiful Soup; you will have to use something else. I picked that Q&A link because there were a lot of answers. If you search with variations of your question, such as `python download images from url site:stackoverflow.com`, you will find other Q&As. – wwii Apr 30 '20 at 13:19
  • Also see https://stackoverflow.com/questions/13137817/how-to-download-image-using-requests, because `requests` is generally more pleasant to work with than `urllib`. – larsks Apr 30 '20 at 13:33
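
Following the comments' suggestion to use `requests`, here is a minimal sketch of how each scraped `img_src` could be saved to disk. The names `download_image` and `filename_from_url`, the `images` directory, the `http` parameter and the 10-second timeout are all assumptions made for illustration; none of them come from the question. It avoids Python 3-only features, so it should also run on Python 2.7.

```python
import os


def filename_from_url(img_url):
    """Use the last path segment of the URL (query string dropped) as the file name."""
    path = img_url.split('?', 1)[0]
    return path.rsplit('/', 1)[-1] or 'image'


def download_image(img_url, dest_dir='images', http=None):
    """Stream img_url into dest_dir and return the local file path.

    `http` is any object with a requests-style get(); it defaults to the
    requests module. (Hypothetical helper, not part of the original script.)
    """
    if http is None:
        # Imported lazily so a stand-in client can be passed in instead.
        import requests
        http = requests

    # os.makedirs(..., exist_ok=True) is Python 3 only, so check first.
    if not os.path.isdir(dest_dir):
        os.makedirs(dest_dir)

    # stream=True avoids loading the whole image into memory at once.
    response = http.get(img_url, stream=True, timeout=10)
    response.raise_for_status()  # raise on 4xx/5xx instead of saving an error page

    dest_path = os.path.join(dest_dir, filename_from_url(img_url))
    with open(dest_path, 'wb') as image_file:
        for chunk in response.iter_content(chunk_size=8192):
            if chunk:  # skip empty keep-alive chunks
                image_file.write(chunk)
    return dest_path
```

Inside the article loop this could be called as `if img_src: download_image(img_src)`, so that articles without an image are skipped. Two images with the same file name would overwrite each other in this sketch; prefixing the name with a counter or a hash of the URL is one way around that.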
