I am trying to scrape images from the home page of my school's website. The catch is that the images are in a slideshow. There are 9 images in total and I want to scrape all of them, but with requests I can only get the first one. I don't want to use Selenium because it is very slow. How can I scrape all the images from the slideshow using only requests and BeautifulSoup? Any help would be appreciated. Thanks!

P.S.: I know I should include the code I have written so far, but I haven't tried anything yet because this is the first time I am scraping images from a slideshow and I have no idea where to start, so please forgive me. I have also gone through the answers to this question, but I couldn't work out how to apply them in my program.

Sushil

2 Answers

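All 9 slider images are already present in the HTML that requests downloads, so Selenium is not needed. You can select them inside the slider element with a CSS selector: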

import requests
from bs4 import BeautifulSoup


url = 'http://www.buddingmindsinternationalschool.com/'
soup = BeautifulSoup(requests.get(url).content, 'html.parser')

for img in soup.select('#rev_slider_2_1 img'):
    print(img['src'])

Prints:

http://www.buddingmindsinternationalschool.com/wp-content/uploads/2017/11/banner-4.jpg
http://www.buddingmindsinternationalschool.com/wp-content/uploads/2017/11/banner-2.jpg
http://www.buddingmindsinternationalschool.com/wp-content/uploads/2017/11/banner-7.jpg
http://www.buddingmindsinternationalschool.com/wp-content/uploads/2017/10/bannerfour.jpg
http://www.buddingmindsinternationalschool.com/wp-content/uploads/2017/11/bannerseven.jpg
http://www.buddingmindsinternationalschool.com/wp-content/uploads/2017/11/bannerfives-1.jpg
http://www.buddingmindsinternationalschool.com/wp-content/uploads/2017/11/banner-5.jpg
http://www.buddingmindsinternationalschool.com/wp-content/uploads/2017/11/bannereight.jpg
http://www.buddingmindsinternationalschool.com/wp-content/uploads/2017/11/banner-6.jpg
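
The selector step can also be wrapped in a small helper that works on any HTML string. This is a minimal sketch, not part of the original answer: the function name `slider_image_urls` and the `SAMPLE_HTML` snippet are made up for illustration, and it assumes some `src` values might be relative or missing (the real page may never do that). It uses `img.get("src")` so a tag without `src` is skipped instead of raising `KeyError`, and `urljoin` so relative paths become full URLs.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def slider_image_urls(html, base_url, selector="#rev_slider_2_1 img"):
    """Return absolute image URLs matched by `selector`, in page order."""
    soup = BeautifulSoup(html, "html.parser")
    urls = []
    for img in soup.select(selector):
        src = img.get("src")  # None if the tag has no src attribute
        if not src:
            continue
        absolute = urljoin(base_url, src)  # resolves relative paths
        if absolute not in urls:  # drop duplicates, keep first occurrence
            urls.append(absolute)
    return urls


# Hypothetical markup for illustration only; it is not the real page.
SAMPLE_HTML = """
<div id="rev_slider_2_1">
  <ul>
    <li><img src="/wp-content/uploads/a.jpg" class="rev-slidebg"></li>
    <li><img src="http://example.com/b.jpg" class="rev-slidebg"></li>
    <li><img class="rev-slidebg"></li>
  </ul>
</div>
<img src="/logo.png">
"""

if __name__ == "__main__":
    for image_url in slider_image_urls(SAMPLE_HTML, "http://example.com/"):
        print(image_url)
```

For the live site you would pass `requests.get(url).text` and the page URL instead of the sample string.
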
Andrej Kesely

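You can select the slider images by their rev-slidebg class and save each one to the current directory under its original file name: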

import requests
from bs4 import BeautifulSoup

url = "http://www.buddingmindsinternationalschool.com/"
soup = BeautifulSoup(requests.get(url).content, "html.parser")

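for tag in soup.select(".rev-slidebg"):
    # Use the last path segment of the URL as the local file name.
    img_name = tag['src'].split('/')[-1]

    # Download first, so a failed request does not leave an empty file behind.
    req = requests.get(tag['src'])
    req.raise_for_status()
    with open(img_name, "wb") as f:
        f.write(req.content)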
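
One possible refinement of the file-name step: `tag['src'].split('/')[-1]` works for the URLs shown above, but it would keep a query string such as `?ver=1` in the name if a URL had one. The sketch below is an assumption-laden alternative, not part of the original answer: `filename_from_url` and its `default` parameter are hypothetical names, and it uses only the standard library.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def filename_from_url(url, default="image"):
    """Derive a local file name from the path part of a URL."""
    path = urlsplit(url).path  # ignores the query string and fragment
    name = posixpath.basename(unquote(path))  # decode %20 and similar escapes
    return name or default  # fall back when the path ends with "/"


if __name__ == "__main__":
    print(filename_from_url(
        "http://www.buddingmindsinternationalschool.com"
        "/wp-content/uploads/2017/11/banner-4.jpg"
    ))
```

In the loop above, `img_name = filename_from_url(tag['src'])` would be a drop-in replacement for the `split` call.
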
MendelG