
I am trying to scrape data from a website, but I run into a character encoding error when writing the results to a file.

Here is the code.

from bs4 import BeautifulSoup
import requests


new_egg = requests.get('https://www.newegg.com/global/in-en/p/pl?d=graphic+card').text
soup = BeautifulSoup(new_egg, 'lxml')
card = soup.find_all('div', class_='item-container')

for index, cards in enumerate(card):
    name = cards.find_all('a', class_='item-title')
    product_name = name[0].text

    brand = cards.div.div.a.img["title"]

    price = cards.find('li', class_='price-current').text

    shipping = cards.find_all('li', class_='price-ship')
    shipping_cost = shipping[0].text
    with open(f'Directories/graphics/{index}.txt', 'w') as f:
        f.write(f'Product Name: {product_name} \n')
        f.write(f'Brand Name: {brand} \n')
        f.write(f'Price: {price} \n')
        f.write(f'Shipping Cost: {shipping_cost} \n')

I get the following error:

Traceback (most recent call last):
  File "O:\Python projects\test.py", line 22, in <module>
    f.write(f'Price: {price} \n')
  File "C:\Users\Aspire3\AppData\Local\Programs\Python\Python39\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u20b9' in position 7: character maps to <undefined>

Process finished with exit code 1
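
The failure can likely be reproduced without the scraper. The sketch below assumes only what the traceback shows: the character is '\u20b9' (the Indian rupee sign) and the file was opened with the cp1252 codec. The variable names are illustrative.

```python
# Minimal reproduction, independent of the scraping code.
rupee = "\u20b9"  # INDIAN RUPEE SIGN, the character named in the traceback

try:
    rupee.encode("cp1252")  # cp1252 has no mapping for U+20B9
    cp1252_ok = True
except UnicodeEncodeError:
    cp1252_ok = False

# UTF-8 can encode every Unicode code point, including this one.
utf8_bytes = rupee.encode("utf-8")

print(cp1252_ok)   # False
print(utf8_bytes)  # b'\xe2\x82\xb9'
```

So the price string itself is fine; the problem is the codec picked by open() when no encoding is given.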
  • Open your text file with `encoding="UTF-8"`. The Windows default is cp1252, which is not what you want here because it cannot encode every character in the scraped text. – BoarGules Jul 02 '21 at 13:25
  • Use this: `with open(f'Directories/graphics/{index}.txt', 'w', encoding="utf-8") as f:` – Ram Jul 02 '21 at 13:30
  • Thank you, Ram, for the solution, but when I run the program I now get another error: 'f' is not defined. – Prajwal Shrestha Jul 04 '21 at 00:26
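
Putting the comments together, a minimal sketch of the write step could look like the following. The helper name `write_card`, the temporary directory, and the sample values are all hypothetical and used only so the example runs without the network; in the real loop the path would be the `Directories/graphics/{index}.txt` one from the question. As for the follow-up error, the post does not show the modified code, so the cause cannot be confirmed, but one possibility is that the `as f` part was dropped when the `with` line was edited, which would raise `NameError: name 'f' is not defined`.

```python
import tempfile
from pathlib import Path


def write_card(path, product_name, brand, price, shipping_cost):
    """Write one product record as UTF-8 text (hypothetical helper)."""
    # encoding="utf-8" overrides the platform default (cp1252 in the traceback).
    # The trailing "as f" binds the file object that the writes below use.
    with open(path, "w", encoding="utf-8") as f:
        # All writes stay indented inside the with block.
        f.write(f"Product Name: {product_name} \n")
        f.write(f"Brand Name: {brand} \n")
        f.write(f"Price: {price} \n")
        f.write(f"Shipping Cost: {shipping_cost} \n")


# Made-up sample values, not scraped data.
out_file = Path(tempfile.mkdtemp()) / "0.txt"
write_card(out_file, "Example Card", "ExampleBrand", "\u20b9 12,345", "Free Shipping")

# Read the file back with the same encoding to check the round trip.
text = out_file.read_text(encoding="utf-8")
print(text)
```

Reading the file back should also specify `encoding="utf-8"`; otherwise the platform default is used again and the rupee sign may be decoded incorrectly.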
