I am trying to scrape the price from an Amazon product page using Beautiful Soup.
The code was written on macOS Catalina (version 10.15.5) with Python 3.8.2; the web browser was Google Chrome 84.0.4147.135 (Official Build) (64-bit).
The last line of the code below shows the output (the price) I am getting.
Is there a way to remove the unwanted characters from the output or improve my code so the final output (price) reflects just ₹1,700.00?
The unwanted characters are "\xa0".
Also, what do these characters mean, and why do they appear in the output? Thanks.
Please refer to the code below:
import bs4
import requests
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.83 Safari/537.36'}
res = requests.get('https://www.amazon.in/Automate-Boring-Python-Albert-Sweigart/dp/1593275994', headers=headers)
res.raise_for_status()
# Name the parser explicitly; without it bs4 guesses one and emits a warning
soup = bs4.BeautifulSoup(res.text, 'html.parser')
soup.select('#soldByThirdParty > span')
[₹ 1,700.00]
elems = soup.select('#soldByThirdParty > span')
elems[0].text
'₹\xa01,700.00'
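
For reference, here is a minimal sketch of the kind of cleanup being asked about. It assumes that `\xa0` is the Unicode no-break space (U+00A0, written `&nbsp;` in HTML) sitting between the currency symbol and the amount. The names `raw_price` and `clean_price` are hypothetical, and the string is hard-coded in place of `elems[0].text` so the sketch runs without a network request.

```python
import unicodedata

# Hypothetical stand-in for elems[0].text from the session above
raw_price = '₹\xa01,700.00'

# Look up what the character actually is (assumed: a no-break space)
char_name = unicodedata.name('\xa0')
print(char_name)  # NO-BREAK SPACE

# Drop the no-break space so only the symbol and the amount remain
clean_price = raw_price.replace('\xa0', '')
print(clean_price)  # ₹1,700.00
```

Note that the interactive session displays the `repr` of the string, which is why the character shows up as the escape `\xa0`; passing the same string to `print` would show it as what looks like an ordinary space.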