I am trying to scrape data from an e-commerce site. I need the discount percentage of each product, which is in a span tag inside a div tag with the class name "VGWI6T". However, my code also returns the discount percentages of products whose div has the class name "VGWI6T _2YXy_Y".

<div>
.......
.......
......
<div class= "VGWI6T">

  <span>25% off</span>

</div>
.....
.....
.....
</div>

.........
...........
......

<div>
....
....
....
<div class= "VGWI6T _2YXy_Y">

  <span>25% off</span>
</div>
....
.....
</div>

How can I grab ONLY the products with the former class name (VGWI6T)? When I run:

Discount = bs.find_all('div',class_='VGWI6T', attars= 'span')

it gives me the discounts of all products, including those whose div has the class VGWI6T _2YXy_Y.

1 Answer

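BeautifulSoup treats class as a multi-valued attribute, so class_='VGWI6T' matches any div whose class list contains VGWI6T, including "VGWI6T _2YXy_Y". Use a CSS selector that requires the class VGWI6T and excludes elements that also have the class _2YXy_Y: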

from bs4 import BeautifulSoup
html='''<div>
.......
.......
......
<div class= "VGWI6T">

  <span>25% off</span>

</div>
.....
.....
.....
</div>

.........
...........
......

<div>
....
....
....
<div class= "VGWI6T _2YXy_Y">

  <span>25% off</span>
</div>
....
.....
</div>'''

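soup = BeautifulSoup(html, "html.parser")
# Select <span> elements inside elements that have the class VGWI6T
# but do not also have the class _2YXy_Y.
for item in soup.select(".VGWI6T:not(._2YXy_Y) span"):
    print(item.text)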
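
If other extra classes might appear alongside VGWI6T, an alternative is to compare the whole class list instead of excluding one known class. The following is only a sketch under a few assumptions: the helper name has_only_discount_class is made up for illustration, the markup is a trimmed version of the sample above, and the second discount value is changed to 40% so the two divs can be told apart in the output.

```python
from bs4 import BeautifulSoup

# Trimmed sample markup; "40% off" is an invented value for illustration.
html = """
<div><div class="VGWI6T"><span>25% off</span></div></div>
<div><div class="VGWI6T _2YXy_Y"><span>40% off</span></div></div>
"""

soup = BeautifulSoup(html, "html.parser")

# Searching by a single class matches both divs, because each div's
# class list contains "VGWI6T".
both = soup.find_all("div", class_="VGWI6T")
print(len(both))


def has_only_discount_class(tag):
    # Hypothetical helper: true only for a <div> whose class list
    # is exactly ["VGWI6T"], with no additional classes.
    return tag.name == "div" and tag.get("class") == ["VGWI6T"]


discounts = [
    span.get_text(strip=True)
    for div in soup.find_all(has_only_discount_class)
    for span in div.find_all("span")
]
print(discounts)
```

The :not() selector is shorter when there is exactly one class to exclude; the exact comparison may be safer if the site adds further modifier classes later.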
KunduK