
In my Django project I use BeautifulSoup for web scraping. It works, but I can't print or slice the result. When I try, it gives this error (I'm doing this in views.py):

"UnicodeEncodeError: 'charmap' codec can't encode character '\u200e' in position 59: character maps to <undefined>"

How can I print the x variable?

    URL = link
    user_agent = getRandomUserAgent()
    headers = {"User-Agent": user_agent}

    page = requests.get(URL, headers=headers)
    soup = BeautifulSoup(page.content, 'html.parser')



    mylist = soup.find_all("td", class_="a-size-base prodDetAttrValue")

    for x in mylist:
        print(x)
  • Can you add the full traceback? Which line is triggering the error? – Iain Shelvington Jul 01 '22 at 23:48
  • This is a duplicate question. Have a look at [this SO](https://stackoverflow.com/questions/27092833/unicodeencodeerror-charmap-codec-cant-encode-characters). – Jeyfel Brandauer Jul 02 '22 at 00:06
  • Does this answer your question? [UnicodeEncodeError: 'charmap' codec can't encode characters](https://stackoverflow.com/questions/27092833/unicodeencodeerror-charmap-codec-cant-encode-characters) – Prophet Dec 07 '22 at 08:19

1 Answer


The desired data can't be pulled with bs4 alone because it is loaded dynamically by JavaScript. You can grab it using bs4 together with Selenium, and in my run that didn't throw a UnicodeEncodeError.

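    import time

    from bs4 import BeautifulSoup
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from webdriver_manager.chrome import ChromeDriverManager

    # Start Chrome with a driver binary downloaded by webdriver_manager
    driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
    driver.get('https://www.amazon.com/dp/B09N4ZL8NV')  # get() returns None, so nothing to assign
    driver.maximize_window()
    time.sleep(3)  # crude wait for the JavaScript-rendered content

    # Parse the rendered HTML rather than the raw HTTP response
    soup = BeautifulSoup(driver.page_source, 'html.parser')
    driver.quit()  # close the browser once the HTML has been captured
    mylist = soup.find_all("td", class_="a-size-base prodDetAttrValue")

    for x in mylist:
        print(x.get_text(strip=True))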

Output:

13.3 Inches
‎1920 x 1200 pixels
‎1920 x 1200 Pixels
‎2.8 GHz core_i7_family
‎16 GB LPDDR4X
‎2.8 GHz
‎512 GB Flash Memory Solid State
‎Intel Iris Xe Graphics
‎Intel
‎Iris Xe Graphics
‎Bluetooth, 802.11ax
‎Lenovo
‎ThinkPad X13 Yoga Gen 2
‎20W80056US
‎PC
‎Windows 11 Pro
‎2.65 pounds
‎8.4 x 12 x 0.61 inches
‎8.4 x 12 x 0.61 inches
‎Black
‎Intel
‎1
‎DDR4 SDRAM
‎512 GB
‎No
B09N4ZL8NV
December 6, 2021
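
Most of the values in this output still begin with the invisible U+200E (left-to-right mark), the character named in the error message. If printing fails on your machine, a minimal sketch like the following may help. It assumes the console uses a legacy code page such as cp1252 (the question doesn't say which one), and `clean_cell` and `encodable` are hypothetical helper names, not part of bs4:

```python
LRM = "\u200e"  # LEFT-TO-RIGHT MARK, the character named in the error


def clean_cell(text: str) -> str:
    """Drop left-to-right marks and surrounding whitespace from a cell's text."""
    return text.replace(LRM, "").strip()


def encodable(text: str, encoding: str = "cp1252") -> bool:
    """Return True if text can be encoded with the given codec (cp1252 is an assumption)."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


raw = "\u200e1920 x 1200 pixels"     # sample value taken from the output above
print(encodable(raw))                # the mark has no cp1252 mapping
print(encodable(clean_cell(raw)))    # the cleaned text encodes without error
print(clean_cell(raw))
```

In the loop you would call `print(clean_cell(x.get_text()))`. Another option, on Python 3.7 or later, is to call `sys.stdout.reconfigure(encoding="utf-8")` before printing, which keeps the character instead of removing it.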
 
Md. Fazlul Hoque
  • I wrote "list" only because I was asking here, and the indentation is correct in my actual code. I have fixed the wrong indentation in the question. The error comes from somewhere else. – ahmet yılmaz234 Jul 01 '22 at 23:47
  • Would you share the url? – Md. Fazlul Hoque Jul 01 '22 at 23:54
  • Here it is: https://www.amazon.com/Lenovo-ThinkPad-20W80056US-Touchscreen-Convertible/dp/B09N4ZL8NV/ref=sr_1_3?crid=JGBXK1SFMJ0C&keywords=lenovo+laptop&qid=1656707364&sprefix=lenovo+lapto%2Caps%2C214&sr=8-3 but I don't think the problem is about that. – ahmet yılmaz234 Jul 02 '22 at 00:13