I tried this code in Python 3.8.2:
from bs4 import BeautifulSoup
import urllib.request
html = urllib.request.urlopen(
'https://vietnamnet.vn/').read()
soup = BeautifulSoup(html, "html.parser").encode("utf-8")
print(soup.title)
But I received:
instead of the expected: <title>Báo VietNamNet - Tin tức online, tin nhanh Việt Nam và thế giới</title>
What am I doing wrong, and how can I fix it?
I thought I had to use .encode("utf-8") because the HTML string contains Unicode characters. Does it affect the soup?
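To narrow this down, here is a small offline check of what .encode("utf-8") returns. It is only a sketch: the html string is a made-up stand-in for the downloaded page, and it uses a plain str instead of a soup object, on the assumption that encoding the soup likewise gives back bytes. If that assumption holds, .title on the result would be the built-in bytes method rather than the <title> tag.

```python
# Hypothetical stand-in for the downloaded page; no network access is needed.
html = "<html><head><title>Báo VietNamNet</title></head></html>"

# Encoding turns the text into a bytes object.
encoded = html.encode("utf-8")

print(type(encoded))            # <class 'bytes'>
print(callable(encoded.title))  # True: bytes.title is a casing method, not a tag

# Decoding gives back the original text, so no characters are lost.
print(encoded.decode("utf-8") == html)  # True
```

So it may be that print(soup.title) only works as expected when soup is still the BeautifulSoup object, with any encoding applied afterwards to the tag or text being output.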
Thanks!