I have the following code in VSCode:

from bs4 import BeautifulSoup
import urllib.request

req = urllib.request.urlopen('https://www.ua-football.com/sport')
html = req.read()

soup = BeautifulSoup(html, 'html.parser')
news = soup.find_all('li', class_='liga-news-item')

When I print `news`, I get an error:

UnicodeEncodeError: 'charmap' codec can't encode characters in position 726-730: character maps to <undefined>

If I try to encode html or soup to utf-8 I will be unable to use find_all method. I literally went across the whole internet but haven't found a solution. Is there a way?

Ivan
  • `news` is a string, yes? If you can't `print` it, because of which characters it contains, then that is a problem with your terminal - not the code. I know it is not `bytes` because that would just print with `\x` escapes for every byte value outside of ASCII printables. I can also tell because it is an *en*code error - meaning that Python is (behind the scenes, in order to talk to your terminal) converting *from* string *to* bytes. First check if you can write the string to a file and open it in a text editor. – Karl Knechtel May 16 '22 at 23:01
  • "If I try to encode html or soup to utf-8 I will be unable to use find_all method." Then encode the thing that you actually need to encode, i.e. `news`. – Karl Knechtel May 16 '22 at 23:03
  • "I literally went across the whole internet but haven't found a solution." I can easily find the linked duplicate by putting `[python] charmap print` [into the site search](https://stackoverflow.com/search?q=%5Bpython%5D+charmap+print), or [similarly into a search engine](https://duckduckgo.com/?q=python+charmap+print). I assume you found similar things in your [research](https://meta.stackoverflow.com/questions/261592); if they don't solve the problem, then we can't possibly help unless you explain *why not*. – Karl Knechtel May 16 '22 at 23:06
  • @KarlKnechtel hi, I've got it working. I decided to install PyCharm just to try the same code there, and it works when I run it in the "runner". Then I went back to VSCode and tried running it in the terminal, and it works there as well. However, for some reason it does not work when running in the "output" panel in VSCode. If I encode `news`, it returns an empty array. By default `news` is a class. – Ivan May 17 '22 at 00:53
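
To illustrate the point made in the comments, here is a minimal sketch of why `print` can fail even though the string itself is fine, and one possible workaround. It assumes the VSCode "output" panel hands Python a stdout that uses a legacy Windows code page such as `cp1252` (the question does not confirm which one). The `sample` text and the `can_encode` helper are hypothetical stand-ins, not part of the original code.

```python
import sys

# Hypothetical stand-in for text scraped from the page (Cyrillic characters).
sample = "Шахтар переміг Динамо"


def can_encode(text: str, encoding: str) -> bool:
    """Return True if `text` can be encoded with `encoding` without an error."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# print() encodes the string using sys.stdout.encoding. With cp1252
# (assumed here), Cyrillic letters have no mapping, which produces the
# "'charmap' codec can't encode characters" error from the question.
print(can_encode(sample, "cp1252"))
print(can_encode(sample, "utf-8"))

# Possible workaround (Python 3.7+): switch stdout to UTF-8 before printing.
# errors="replace" avoids a crash if a character still cannot be shown.
if hasattr(sys.stdout, "reconfigure"):
    sys.stdout.reconfigure(encoding="utf-8", errors="replace")

print(sample)
```

With stdout reconfigured this way, `print(news)` should no longer need any manual `.encode()` call, so `find_all` keeps working on the unmodified `soup`. Setting the environment variable `PYTHONIOENCODING=utf-8` before launching the script is another option that may have the same effect without changing the code.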

0 Answers