I watched a video that teaches how to use BeautifulSoup and requests to scrape a website. Here's the code:
from bs4 import BeautifulSoup as bs4
import requests
import pandas as pd

pages = []  # list of page URLs to scrape
pages_to_scrape = 1

for i in range(1, pages_to_scrape + 1):
    url = 'http://books.toscrape.com/catalogue/page-{}.html'.format(i)
    pages.append(url)

for item in pages:
    page = requests.get(item)
    soup = bs4(page.text, 'html.parser')
    #print(soup.prettify())
    for j in soup.findAll('p', class_='price_color'):
        price = j.getText()
        print(price)
The code is working well. However, in the results I noticed a weird character before the currency symbol, and when I checked the HTML source I couldn't find that character. Any ideas why this character appears, and how can it be fixed? Is using replace enough, or is there a better approach?
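
For reference, a stray character like this usually points to an encoding mismatch rather than something in the HTML itself. The sketch below reproduces the effect offline, assuming (this is not verified against the site) that the page is served as UTF-8 but without a charset in its Content-Type header, in which case requests falls back to ISO-8859-1 when building `page.text`. The sample price string and the `repair` helper are hypothetical and only illustrate the mechanics.

```python
# Illustration only: no network access, just the byte-level mechanics.
# Assumption: the server sends UTF-8 bytes but the text is decoded as ISO-8859-1.

raw = "\u00a351.77".encode("utf-8")   # a pound-sign price as UTF-8 bytes

wrong = raw.decode("iso-8859-1")      # what a Latin-1 fallback produces
right = raw.decode("utf-8")           # decoding with the actual encoding

print(repr(wrong))   # the pound sign is now preceded by an extra character
print(repr(right))


def repair(text):
    """Hypothetical helper: undo a Latin-1 misdecoding of UTF-8 text."""
    return text.encode("iso-8859-1").decode("utf-8")


print(repair(wrong) == right)
```

If that assumption holds, two options in the scraping loop are likely cleaner than `replace`: set `page.encoding = 'utf-8'` before reading `page.text`, or pass the raw bytes with `bs4(page.content, 'html.parser')` so BeautifulSoup detects the encoding itself. A plain `replace` would only hide this one symptom and could miss other non-ASCII characters.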