I want to scrape the name and price from this website:

https://www.flipkart.com/laptops/~buyback-guarantee-on-laptops-/pr?sid=6bo%2Cb5g&uniqBStoreParam1=val1&wid=11.productCard.PMU_V2

Both name and price are within div tags.

Printing name works fine, but printing Price gives me an error:

Traceback (most recent call last):
  File "c:\File.py", line 37, in <module>
    print(price.text)
  File "C:\Python37\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u20b9' in position 0: character maps to <undefined>

Code:

from selenium import webdriver
from bs4 import BeautifulSoup
import pandas as pd
import requests

response = requests.get("https://www.flipkart.com/laptops/~buyback-guarantee-on-laptops-/pr?sid=6bo%2Cb5g&uniq")
soup = BeautifulSoup(response.text, 'html.parser')
for a in soup.findAll('a',href=True, attrs={'class':'_31qSD5'}):
    name=a.find('div', attrs={'class':'_3wU53n'})
    price=a.find('div', attrs={'class':'_1vC4OE _2rQ-NK'})
    print(name.text)
    print(price.text)  # this is the call that raises the UnicodeEncodeError

What is the difference between those two div elements?

Why does printing one of them raise an error while printing the other does not?

Serdia

1 Answer

Python raises that error because, as the traceback shows, the console output is being encoded with cp1252, and that code page has no mapping for the Indian rupee sign (U+20B9, '\u20b9'). The product names evidently contain only characters that cp1252 can represent, which is why printing name works. If we change your last print statement to print(str(price.text.encode("utf-8"))), we get results that look like this:

b'\xe2\x82\xb961,990'
b'\xe2\x82\xb940,000'
b'\xe2\x82\xb963,854'
b'\xe2\x82\xb934,990'
b'\xe2\x82\xb948,990'
b'\xe2\x82\xb952,990'
b'\xe2\x82\xb932,990'
b'\xe2\x82\xb954,990'
b'\xe2\x82\xb952,990'
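To see why the plain print fails while the encoded one works, here is a minimal sketch that checks whether a string can be encoded with a given codec. The helper name can_encode and the sample price string are assumptions made for illustration; they are not part of the original script.

```python
RUPEE = "\u20b9"  # Indian rupee sign, the character named in the traceback


def can_encode(text, encoding):
    """Return True if every character in text exists in the given codec."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


sample = RUPEE + "61,990"  # hypothetical price string in the page's format

print(can_encode(sample, "cp1252"))  # cp1252 has no slot for the rupee sign
print(can_encode(sample, "utf-8"))   # UTF-8 can encode any Unicode character
print(sample.encode("utf-8"))        # the rupee sign becomes three bytes
```

The three bytes \xe2\x82\xb9 at the start of each result are simply the UTF-8 form of the rupee sign, followed by the digits unchanged.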

Since the b'...' output isn't very pretty and probably isn't usable, I would personally strip that symbol before printing. If you really want Python to print the Indian rupee symbol, you can add it to your charmap. Follow the steps from this post to add customizations to the charmap.
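Stripping the symbol could look roughly like the sketch below. The function names parse_price and printable are hypothetical helpers, and the sketch assumes prices always have the form of a rupee sign followed by comma-separated digits, as in the output shown earlier.

```python
import sys

RUPEE = "\u20b9"  # Indian rupee sign


def parse_price(text):
    """Strip the rupee sign and thousands separators, return the price as an int."""
    cleaned = text.replace(RUPEE, "").replace(",", "").strip()
    return int(cleaned)


def printable(text, encoding=None):
    """Replace characters the target encoding cannot represent with '?'."""
    # Fall back to the console encoding, then to UTF-8 if that is unknown.
    encoding = encoding or sys.stdout.encoding or "utf-8"
    return text.encode(encoding, errors="replace").decode(encoding)


sample = RUPEE + "61,990"  # hypothetical price string in the page's format

print(parse_price(sample))          # numeric value, safe on any console
print(printable(sample, "cp1252"))  # rupee sign replaced, digits kept
```

In the scraping loop this would mean calling print(parse_price(price.text)) instead of print(price.text). On Python 3.7 or later, calling sys.stdout.reconfigure(encoding="utf-8") before printing is another option, assuming the terminal itself can display UTF-8.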

Joseph Rajchwald