from bs4 import BeautifulSoup

with open('website.html') as file:
    contents = file.read()

soup = BeautifulSoup(contents, "html.parser")
all_anchor_tags = soup.find_all(name="a")
for tag in all_anchor_tags:
    print(tag.get("href"))
heading = soup.find(name="h1")
print(heading)

Output

Traceback (most recent call last):
  File "C:\Users\asus\Desktop\repos\100_days_of_python\Day-45\main.py", line 4, in <module>
    contents = file.read()
               ^^^^^^^^^^^
  File "C:\Users\asus\AppData\Local\Programs\Python\Python311\Lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 2366: character maps to <undefined>

Process finished with exit code 1

This code worked perfectly well on a few websites, but when I tried it on another one it gave me this error.

Andrej Kesely

1 Answer


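Try this. The traceback shows Python decoding the file with cp1252 (the default on your Windows setup), while the file is probably saved as UTF-8, so pass the encoding explicitly: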

with open('fb.html', encoding="utf-8") as file:
    contents = file.read()

It's already answered here; check this for more info.
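To see why the explicit encoding matters, here is a minimal sketch using only the standard library. It assumes the failing file is UTF-8; the sample HTML string and the temporary file name `sample.html` are made up for illustration and are not from the question. The character "Á" is stored in UTF-8 as the bytes 0xC3 0x81, and cp1252 has no character for byte 0x81, which matches the error in the traceback.

```python
import os
import tempfile

# Hypothetical sample content: "Á" (U+00C1) is encoded in UTF-8
# as the two bytes 0xC3 0x81.
html = "<h1>Á</h1>"
raw = html.encode("utf-8")

# cp1252 leaves byte 0x81 undefined, so decoding these bytes as
# cp1252 raises UnicodeDecodeError, like in the question.
try:
    raw.decode("cp1252")
    cp1252_failed = False
except UnicodeDecodeError:
    cp1252_failed = True

# Write the bytes to a temporary file, then read it back with an
# explicit encoding instead of relying on the platform default.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.html")  # illustrative file name
    with open(path, "wb") as f:
        f.write(raw)
    with open(path, encoding="utf-8") as f:
        contents = f.read()

print(cp1252_failed)
print(contents == html)
```

If the file turns out not to be UTF-8, this will raise a different decode error, and you would need to find out which encoding the page was actually saved in.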

Ajay Negi