
Trying to filter the product name list using the header tags, but it always returns None.

source : https://www.tendercuts.in/chicken

code :

import requests
from bs4 import BeautifulSoup

def ExtractData(url):
    response = requests.get(url=url).content
    soup = BeautifulSoup(response, 'lxml')
    header = soup.find("mat-card-header", {"class": "mat-card-header ng-tns- c9-188"})
    print(header)

ExtractData(url="https://www.tendercuts.in/chicken")
HedgeHog

3 Answers


Here's code that iterates over all the <mat-card-header> items, showing each header's class attribute and the text of its associated mat-card-title. You can further filter on the child elements of each header to find particular products.

soup = BeautifulSoup(response, 'lxml')
headers = soup.find_all("mat-card-header")
for header in headers:
    print(header.get('class'), header.find('mat-card-title').text)

Output:

['mat-card-header', 'ng-tns-c9-3'] Chicken Curry Cut (Skin Off)
['mat-card-header', 'ng-tns-c9-3'] Chicken Curry Cut (Skin Off)
...
['mat-card-header', 'ng-tns-c9-19'] Chicken Wings
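As a sketch of the "further filter on the child elements" idea, here is a self-contained example. The markup is a hand-written, simplified stand-in for the real page (not a live fetch), and the stdlib html.parser is used so nothing beyond BeautifulSoup is required:

```python
from bs4 import BeautifulSoup

# Simplified stand-in for the page markup, assuming the site nests
# <mat-card-title> inside <mat-card-header> as in the output above.
html = """
<mat-card-header class="mat-card-header ng-tns-c9-3">
  <mat-card-title>Chicken Curry Cut (Skin Off)</mat-card-title>
</mat-card-header>
<mat-card-header class="mat-card-header ng-tns-c9-19">
  <mat-card-title>Chicken Wings</mat-card-title>
</mat-card-header>
"""
soup = BeautifulSoup(html, 'html.parser')  # 'lxml' works too, if installed

# Keep only the headers whose title mentions a particular product.
wings = [h for h in soup.find_all('mat-card-header')
         if 'Wings' in h.find('mat-card-title').text]

print([h.find('mat-card-title').text for h in wings])  # ['Chicken Wings']
```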
CodeMonkey

What happens?

You are trying to find your tags by a class that does not exist in your soup, either because the class is generated dynamically or because of a typo.
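A minimal, self-contained demonstration of that failure mode. The markup is a hand-written stand-in for the server response (not a live fetch), with an Angular class suffix chosen to differ from the one in the question:

```python
from bs4 import BeautifulSoup

# Stand-in for the server-side HTML: the dynamic class suffix here (c9-3)
# differs from the one seen in the browser dev tools (c9-188).
html = ('<mat-card-header class="mat-card-header ng-tns-c9-3">'
        '<mat-card-title>Chicken Wings</mat-card-title></mat-card-header>')
soup = BeautifulSoup(html, 'html.parser')

# Matching on the dynamic class fails ...
print(soup.find('mat-card-header', {'class': 'ng-tns-c9-188'}))  # None
# ... while matching on the tag alone succeeds.
print(soup.find('mat-card-title').text)  # Chicken Wings
```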

How to fix?

Select your elements more specifically, by tag or id, and avoid classes, since these are often generated dynamically:

[t.text for t in soup.find_all('mat-card-title')]

To avoid the duplicates, just call set() on the result:

set([t.text for t in soup.find_all('mat-card-title')])

Example

import requests
from bs4 import BeautifulSoup

URL = 'https://www.tendercuts.in/chicken'
r = requests.get(URL)
soup = BeautifulSoup(r.text, 'lxml')

print(set([t.text for t in soup.find_all('mat-card-title')]))

Output

{'Chicken Biryani Cut - Skin On','Chicken Biryani Cut - Skinless','Chicken Boneless (Cubes)','Chicken Breast Boneless','Chicken Curry Cut (Skin Off)','Chicken Curry Cut (Skin On)','Chicken Drumsticks',     'Chicken Liver','Chicken Lollipop','Chicken Thigh & Leg (Boneless)','Chicken Whole Leg','Chicken Wings','Country Chicken','Minced Chicken','Premium Chicken-Strips (Boneless)','Premium Chicken-Supreme (Boneless)','Smoky Country Chicken (Turmeric)'}

EDIT

To get titles, prices, etc., I would recommend iterating the mat-cards in the following way.

import re
import requests
from bs4 import BeautifulSoup

URL = 'https://www.tendercuts.in/chicken'
r = requests.get(URL)
soup = BeautifulSoup(r.text, 'lxml')

data = []
# Every card appears twice in the markup, so take every second match.
for item in soup.select('mat-card:has(mat-card-title)')[::2]:
    data.append({
        'title': item.find('mat-card-title').text,
        # search within this card, not the whole soup, so each
        # product gets its own price
        'price': re.search(r'₹\d*', item.find('p', class_='current-price').text).group(),
        'weight': w if (w := item.select_one('.weight span span:last-of-type').next_sibling) else None
    })

print(data)

Output

[{'title': 'Chicken Curry Cut (Skin Off)', 'price': '₹99', 'weight': 'Customizable'}, {'title': 'Chicken Curry Cut (Skin On)', 'price': '₹99', 'weight': 'Customizable'}, {'title': 'Country Chicken', 'price': '₹99', 'weight': 'Customizable'}, {'title': 'Premium Chicken-Supreme (Boneless)', 'price': '₹99', 'weight': ' 330 - 350 Gms'}, {'title': 'Chicken Boneless (Cubes)', 'price': '₹99', 'weight': ' 480 - 500 Gms'}, {'title': 'Chicken Drumsticks', 'price': '₹99', 'weight': ' 280 - 360 Gms'}, {'title': 'Chicken Biryani Cut - Skin On', 'price': '₹99', 'weight': ' 480 - 500 Gms'}, {'title': 'Chicken Thigh & Leg (Boneless)', 'price': '₹99', 'weight': ' 480 - 500 Gms'}, {'title': 'Chicken Biryani Cut - Skinless', 'price': '₹99', 'weight': ' 480 - 500 Gms'}, {'title': 'Minced Chicken', 'price': '₹99', 'weight': ' 480 - 500 Gms'}, {'title': 'Smoky Country Chicken (Turmeric)', 'price': '₹99', 'weight': ' 650 - 800 Gms'}, {'title': 'Chicken Lollipop', 'price': '₹99', 'weight': ' 280 - 300 Gms'}, {'title': 'Chicken Whole Leg', 'price': '₹99', 'weight': ' 370 - 390 Gms'}, {'title': 'Chicken Breast Boneless', 'price': '₹99', 'weight': ' 240 - 280 Gms'}, {'title': 'Premium Chicken-Strips (Boneless)', 'price': '₹99', 'weight': ' 330 - 350 Gms'}, {'title': 'Chicken Liver', 'price': '₹99', 'weight': ' 190 - 210 Gms'}, {'title': 'Chicken Wings', 'price': '₹99', 'weight': ' 480 - 500 Gms'}]
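As a side note, the [::2] slice assumes every card appears exactly twice in the markup. If that assumption ever breaks, deduplicating by title is a safer sketch. The rows below are sample data in the same shape as the scraped data, not real output:

```python
# Sample rows in the same shape as the scraped `data` above (illustrative).
rows = [
    {'title': 'Chicken Wings', 'price': '₹99'},
    {'title': 'Chicken Wings', 'price': '₹99'},
    {'title': 'Chicken Liver', 'price': '₹99'},
]

seen = {}
for row in rows:
    seen.setdefault(row['title'], row)  # keep the first occurrence per title

unique = list(seen.values())
print([r['title'] for r in unique])  # ['Chicken Wings', 'Chicken Liver']
```

Since dicts preserve insertion order, the products come out in page order, unlike a plain set().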
HedgeHog

This is the most common problem with web scraping: most websites use JavaScript to change or add to the content on the page after loading the initial page. Whatever the JavaScript is supposed to change or load isn't on the page after the initial request.

The same is true for your code. If you look at the actual HTML (not in a browser, in your code), you'll find that it has many fields that angular.js code will be filling in later.

You'll need to load your page using a package like selenium, which uses a browser driver to load the page, execute the JavaScript and make the result available to you. (It does a lot more, like letting you navigate the site by clicking elements, filling out fields, etc.)

selenium is a complex library with many options, but you can get started with:

pip install selenium

And by downloading a browser driver like Gecko Driver or ChromeDriver.

And then something like this will work:

from selenium import webdriver
# pick ONE of these, depending on your browser:
from selenium.webdriver.chrome.service import Service    # for Chrome
# from selenium.webdriver.firefox.service import Service  # for Firefox

service = Service('/path/to/driver')
service.start()

driver = webdriver.Remote(service.service_url)
driver.get('https://www.tendercuts.in/chicken')

# do something with what driver loaded here

driver.quit()

You could just bs4 your way through driver.page_source, but since you now have selenium anyway, you could also look into the ways selenium allows you to find and select elements, like using the built-in XPath functions.

Grismar
  • Can whoever voted down the answer please provide a comment on what is wrong with it? – Grismar Feb 11 '22 at 20:58
  • Not the voter, but this answer seems like a rehash of a [canonical thread](https://stackoverflow.com/questions/8049520/web-scraping-javascript-page-with-python/) that it'd be best just to link to. – ggorlen Feb 11 '22 at 20:59
  • The same information is certainly in there, thanks - will vote to close, because that has all bases covered. – Grismar Feb 11 '22 at 21:03
  • Thanks, although it might not apply to OP's case here. It seems their data is in the static markup and it's just a typo on the class. – ggorlen Feb 11 '22 at 21:20