
I am trying to scrape all bonus items from a supermarket's website. After inspecting the HTML, I found that the name of each bonus item is in a span with the class "line-clamp_root__3yA0X line-clamp_active__2502b".


However, when I try to find this span by class name, nothing is found. Here is my code:

import requests
from bs4 import BeautifulSoup

url = 'https://www.ah.nl/bonus'

# Download the page and parse the HTML returned by the server
response = requests.get(url)
soup = BeautifulSoup(response.content, 'html.parser')

# Look for the spans that hold the bonus item names
soup.find_all('span', {'class': 'line-clamp_root__3yA0X line-clamp_active__2502b'})

The output is an empty list: []

Does anyone have an idea what I am doing wrong?

Many thanks in advance!

P.S. My final goal is to scrape the names of all bonus items :)

Barry

1 Answer


That class attribute contains two classes. To select elements that have both, you'll either need to match the exact value of the full attribute using class_=:

soup.find_all('span', class_='line-clamp_root__3yA0X line-clamp_active__2502b')

Or you'll need to use a CSS selector, which goes through select() rather than find_all():

soup.select('span.line-clamp_root__3yA0X.line-clamp_active__2502b')
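To make the difference between the two approaches concrete, here is a small sketch on an inline HTML snippet. The markup and the item names are made up for illustration and are not taken from the real page; only the class names come from the question. Note that a CSS selector has to be passed to select(), since find_all() would treat the string as a tag name.

```python
from bs4 import BeautifulSoup

# Made-up markup for illustration; the real page's HTML will differ.
html = """
<span class="line-clamp_root__3yA0X line-clamp_active__2502b">Item A</span>
<span class="line-clamp_active__2502b line-clamp_root__3yA0X">Item B</span>
<span class="line-clamp_root__3yA0X">Item C</span>
"""
soup = BeautifulSoup(html, 'html.parser')

# Exact string match on the whole class attribute: the order of the classes matters.
exact = soup.find_all('span', class_='line-clamp_root__3yA0X line-clamp_active__2502b')

# CSS selector: matches spans that carry both classes, in any order.
both = soup.select('span.line-clamp_root__3yA0X.line-clamp_active__2502b')

exact_names = [span.get_text() for span in exact]
both_names = [span.get_text() for span in both]
print(exact_names)
print(both_names)
```

The CSS selector is usually the more robust choice, because it does not depend on the order in which the classes are written.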
Sean
  • Thanks for your quick response. I just tried both options, but neither seems to work in this case. Unfortunately, both options output the same [] – Barry Nov 05 '21 at 12:55
  • Those elements are loaded by JavaScript after the initial page load, so they are not in the HTML that requests receives. You may need to use a different scraping library that can execute JavaScript: https://stackoverflow.com/questions/2148493/scrape-html-generated-by-javascript-with-python – Sean Nov 05 '21 at 12:59
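One way to check whether this is what is happening is to see if the class name occurs anywhere in the HTML the server actually returns. The following is a minimal sketch using only the standard library; the helper names ClassCollector and static_classes are hypothetical, and the two HTML strings are made-up stand-ins for a server response and a browser-rendered page, not the real content of the site.

```python
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Collects every class name that appears in the parsed HTML."""

    def __init__(self):
        super().__init__()
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == 'class' and value:
                # A class attribute may hold several space-separated names.
                self.classes.update(value.split())


def static_classes(html):
    """Return the set of class names present in an HTML string."""
    collector = ClassCollector()
    collector.feed(html)
    return collector.classes


# Made-up examples: what a server might send before any script has run,
# versus what the browser's inspector shows after rendering.
server_html = '<div id="app"></div><script src="bundle.js"></script>'
rendered_html = (
    '<div id="app"><span class="line-clamp_root__3yA0X '
    'line-clamp_active__2502b">Item</span></div>'
)

in_server = 'line-clamp_root__3yA0X' in static_classes(server_html)
in_rendered = 'line-clamp_root__3yA0X' in static_classes(rendered_html)
print(in_server, in_rendered)
```

Calling static_classes(response.text) on the real response would show the same thing for the actual page: if the class is missing there but visible in the browser's inspector, the elements are being added by a script, and no BeautifulSoup selector will find them in that response.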