I'm trying to parse an HTML page that I saved to my computer (Windows 10):
from bs4 import BeautifulSoup

with open("res/JLPT N5 vocab list.html", "r", encoding="utf8") as f:
    soup = BeautifulSoup(f, "html.parser")

tables = soup.find_all("table")
sectable = tables[1]
for tr in sectable.contents[1:]:
    if tr.name == "tr":
        try:
            print(tr.td.a.get_text())
        except AttributeError:
            continue
It should print all of the Japanese words in the first column, but an error is raised at print(tr.td.a.get_text()):
UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-1: character maps to <undefined>
How can I solve this error?
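
A possible explanation, assuming the script runs in a Windows console that uses a legacy code page such as cp1252: print() encodes text with sys.stdout.encoding, and a 'charmap' code page has no mapping for Japanese characters. The sketch below reproduces that failure without a console and shows one way to switch stdout to UTF-8. The helper names can_encode and enable_utf8_stdout are made up for illustration, and sys.stdout.reconfigure needs Python 3.7 or later.

```python
import sys


def can_encode(text, encoding):
    """Return True if `text` can be encoded with `encoding`, else False."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def enable_utf8_stdout():
    """Switch sys.stdout to UTF-8 if the stream supports it (Python 3.7+).

    Returns True when stdout was reconfigured, False otherwise.
    """
    if hasattr(sys.stdout, "reconfigure"):
        sys.stdout.reconfigure(encoding="utf-8")
        return True
    return False


if __name__ == "__main__":
    # cp1252 stands in for the console code page here; that is an assumption.
    print(can_encode("日本語", "cp1252"))  # legacy code page cannot encode it
    print(can_encode("日本語", "utf-8"))   # UTF-8 can
    if enable_utf8_stdout():
        print("日本語")
```

If this is the cause, calling enable_utf8_stdout() before the loop should let the print call succeed, although the console font still has to contain Japanese glyphs for the words to display correctly. Setting the environment variable PYTHONIOENCODING=utf-8 before starting Python, or writing the words to a file opened with encoding="utf8", are alternatives that avoid the console encoding altogether.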