Python web scraping returns the wrong price

Question

I want to compare the price of a coconut on two websites. The two stores (websites) are called Laughs and Glomark.

I have two files, main.py and comparison.py. I think the problem is in the part that scrapes the Laughs price. The code runs without error. My actual output and my expected output are shown below, after the code.

main.py

from compare_prices import compare_prices 
laughs_coconut = 'https://scrape-sm1.github.io/site1/COCONUT%20market1super.html'
glomark_coconut = 'https://glomark.lk/coconut/p/11624'
compare_prices(laughs_coconut,glomark_coconut)

comparison.py

import requests
import json
from bs4 import BeautifulSoup

#Imitate the Mozilla browser.
user_agent = {'User-agent': 'Mozilla/5.0'}

def compare_prices(laughs_coconut,glomark_coconut):
    # Acquire the web pages which contain the product price
    laughs_coconut = requests.get(laughs_coconut)
    glomark_coconut = requests.get(glomark_coconut)

    # LaughsSuper supermarket website provides the price in a span text.
    soup_laughs = BeautifulSoup(laughs_coconut.text, 'html.parser')
    price_laughs = soup_laughs.find('span',{'class': 'price'}).text
    
    
    # Glomark supermarket website provides the data in JSON format in an inline script.
    soup_glomark = BeautifulSoup(glomark_coconut.text, 'html.parser')
    script_glomark = soup_glomark.find('script', {'type': 'application/ld+json'}).text
    data_glomark = json.loads(script_glomark)
    price_glomark = data_glomark['offers'][0]['price']

    
    #TODO: Parse the values as floats, and print them.
    price_laughs = price_laughs.replace("Rs.","")
    price_laughs = float(price_laughs)
    price_glomark = float(price_glomark)
    print('Laughs   COCONUT - Item#mr-2058 Rs.: ', price_laughs)
    print('Glomark  Coconut Rs.: ', price_glomark)
    
    # Compare the prices and print the result
    if price_laughs > price_glomark:
        print('Glomark is cheaper Rs.:', price_laughs - price_glomark)
    elif price_laughs < price_glomark:
        print('Laughs is cheaper Rs.:', price_glomark - price_laughs)    
    else:
        print('Price is the same')

My code runs without error, and its output is:

Laughs   COCONUT - Item#mr-2058 Rs.:  0.0

Glomark  Coconut Rs.:  110.0

Laughs is cheaper Rs.: 110.0

But the expected output is:

Laughs   COCONUT - Item#mr-2058 Rs.:  95.0

Glomark  Coconut Rs.:  110.0

Laughs is cheaper Rs.: 15.0

Note: the element that holds the Laughs coconut price is <span class="price">Rs.95.00</span>

If you're creating a [MRE] that is _minimal_, none of the code related to the glomark coconut is relevant to your problem, which is that you are unable to parse the price of the laughs coconut. You should remove all irrelevant code from your question. — Pranav Hosangadi, Jan 20 '23 at 18:58

score 0 · Accepted Answer · answered Jan 20 '23 at 17:08


The page contains two elements that match 'span',{'class': 'price'}. The find() method returns only the first match, which is why you get 0.0. Use findAll() instead and take the second element. If you change the line in your code to price_laughs = soup_laughs.findAll('span',{'class': 'price'})[1].text the problem will be solved.
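To illustrate the difference, here is a minimal sketch. The HTML below is hypothetical and only assumes that an unrelated price span appears before the product price, as the real page seems to do. It uses find_all(), which is the current name for findAll().

```python
from bs4 import BeautifulSoup

# Hypothetical markup (an assumption, not the real page): an earlier
# price span, such as a cart total, comes before the product price.
html = """
<div class="cart"><span class="price">Rs.0.00</span></div>
<div class="product"><span class="price">Rs.95.00</span></div>
"""

soup = BeautifulSoup(html, 'html.parser')

# find() stops at the first matching element.
first_price = soup.find('span', {'class': 'price'}).text

# find_all() returns every match, in document order.
all_prices = [span.text for span in soup.find_all('span', {'class': 'price'})]

print(first_price)                              # Rs.0.00
print(all_prices)                               # ['Rs.0.00', 'Rs.95.00']
print(float(all_prices[1].replace('Rs.', '')))  # 95.0
```

Indexing with [1] depends on the order of elements on the page, so it may break if the layout changes.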


Polatkan Polat



In newer code, avoid the old `findAll()` syntax; use `find_all()` or `select()` with CSS selectors instead. For more, take a minute to [check the docs](https://www.crummy.com/software/BeautifulSoup/bs4/doc/#method-names) – HedgeHog Jan 20 '23 at 18:44

HedgeHog · Answer 2 · 2023-01-20T18:49:41.623

Try changing your strategy for selecting the element. The price's container has an id, which lets you select it more specifically. You could use CSS selectors, for example:

price_laughs = soup_laughs.select_one('[id^="product-price"] .price').text
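As a rough sketch of how that selector behaves, the snippet below runs it against hypothetical markup. The id value product-price-2058 is an assumption for illustration; the selector only requires that the id starts with product-price.

```python
from bs4 import BeautifulSoup

# Hypothetical markup (an assumption): only the second span sits inside
# a container whose id starts with "product-price".
html = """
<span class="price">Rs.0.00</span>
<div id="product-price-2058"><span class="price">Rs.95.00</span></div>
"""

soup_laughs = BeautifulSoup(html, 'html.parser')

# [id^="product-price"] matches any element whose id begins with that text;
# the descendant selector .price then picks the span inside it.
price_tag = soup_laughs.select_one('[id^="product-price"] .price')

# select_one() returns None when nothing matches, so guard before using .text.
price_laughs = float(price_tag.text.replace('Rs.', '')) if price_tag else None

print(price_laughs)  # 95.0
```

Unlike indexing into a list of matches, this does not depend on how many other price spans come earlier on the page.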

For the other website, you could also use its API to get the price:

requests.get('https://glomark.lk/product-page/variation-detail/11624', headers={'x-requested-with': 'XMLHttpRequest'}).json()['price']
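Wrapped in a small function, that call might look like the sketch below. The function name glomark_price and its get parameter are hypothetical helpers, and the code assumes the 'price' field can be converted directly with float().

```python
import requests


def glomark_price(product_id, get=requests.get):
    """Return the price from the variation-detail endpoint as a float.

    `get` is a hypothetical hook that defaults to requests.get, so the
    function can be tried without network access.
    """
    url = f'https://glomark.lk/product-page/variation-detail/{product_id}'
    response = get(url, headers={'x-requested-with': 'XMLHttpRequest'}, timeout=10)
    response.raise_for_status()  # raise on HTTP error statuses
    # Assumption: 'price' is a number or a plain numeric string.
    return float(response.json()['price'])
```

Calling glomark_price(11624) would then request the same URL as the one-liner above.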

