
I am new to Python. I tried to run this code, but I get the error ImportError: No module named 'HTMLParser'. I am using Python 3.x. Why is this not working?

#Import the HTMLParser module
from HTMLParser import HTMLParser

#Create a subclass and override the handler methods
class MyHTMLParser(HTMLParser):

    #Function to handle the processing of HTML comments
    def handle_comment(self,data):
        print ("Encountered comment: ", data)
        pos = self.getpos()
        print ("At line: ", pos[0], "position ", pos[1])

def main():
    #Instantiate the parser and feed it some html
    parser= MyHTMLParser()

    #Open the sample file and read it
    f = open("myhtml.html")
    if f.mode== "r":
        contents= f.read()  #read the entire file
        parser.feed()


if __name__== "__main__":
    main()

I am getting the following error:

Traceback (most recent call last):
  File "C:\Users\bm250199\workspace\test\htmlparsing.py", line 3, in <module>
    from HTMLParser import HTMLParser
ImportError: No module named 'HTMLParser'
user2625433
  • https://docs.python.org/2/library/htmlparser.html - "The HTMLParser module has been renamed to html.parser in Python 3" – Keith Hall Jan 06 '16 at 10:22

1 Answer


The module is called html.parser in Python 3. So you need to change your import to reflect that new name:

from html.parser import HTMLParser

You should always check the standard library documentation to make sure that you are importing the right things from the right location.
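As a minimal sketch of the corrected import in use, assuming Python 3: the class name `CommentParser` and the sample HTML string are made up for illustration and are not part of the question's code. The parser records each comment together with the position reported by `getpos()`.

```python
from html.parser import HTMLParser  # Python 3 location of the class


class CommentParser(HTMLParser):
    """Hypothetical subclass that collects HTML comments and their positions."""

    def __init__(self):
        super().__init__()
        self.comments = []

    def handle_comment(self, data):
        # getpos() returns a (line number, column offset) tuple
        self.comments.append((data, self.getpos()))


parser = CommentParser()
# feed() needs the markup as an argument; here it is an inline sample string
parser.feed("<html>\n<!-- a sample comment -->\n<body></body></html>")
parser.close()

for text, (line, column) in parser.comments:
    print("Encountered comment:", text)
    print("At line:", line, "position", column)
```

Storing the comments on the instance instead of only printing them makes the result easy to inspect afterwards.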

poke
  • Thanks, that seems to have fixed that issue. However, I am now getting the following message: – user2625433 Jan 06 '16 at 10:31
  • Traceback (most recent call last): File "C:\Users\bm250199\workspace\test\htmlparsing.py", line 26, in main() File "C:\Users\bm250199\workspace\test\htmlparsing.py", line 22, in main parser.feed() TypeError: feed() missing 1 required positional argument: 'data' – user2625433 Jan 06 '16 at 10:32
  • Again, [check the documentation](https://docs.python.org/3/library/html.parser.html#html.parser.HTMLParser.feed) and read the error message: You need to pass data to `parser.feed()`. – poke Jan 06 '16 at 10:36
  • It works fine once I pass the data to `parser.feed()`. Thanks – user2625433 Jan 06 '16 at 10:45
  • Update as of 3.5, you will get a deprecation warning using this: `The unescape method is deprecated and will be removed in 3.5, use html.unescape() instead.` Now it's just `from html import unescape` and `escaped_text = unescape(original_text)` – Brendan Jul 19 '18 at 04:10
  • @Brendan Are you sure you commented on the right question? This is about the HTML parser, and not about an unescape function. – poke Jul 19 '18 at 06:26
  • @poke Ha, yes, I totally commented on the wrong question. Whoops! – Brendan Jul 20 '18 at 22:03
  • @Brendan's comment helped me though. ;) – deltaray Sep 10 '21 at 17:16
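
The `TypeError` discussed in the comments above comes from calling `parser.feed()` with no argument. A rough sketch of the fix, assuming Python 3, follows. The names `CommentCounter` and `count_comments` are hypothetical, and the sample file is written to a temporary directory only so that the snippet runs on its own.

```python
import os
import tempfile
from html.parser import HTMLParser


class CommentCounter(HTMLParser):
    """Hypothetical subclass that counts the HTML comments it encounters."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_comment(self, data):
        self.count += 1


def count_comments(path):
    """Read the file at `path` and pass its contents to feed()."""
    parser = CommentCounter()
    # The with-statement closes the file even if reading fails
    with open(path, encoding="utf-8") as f:
        contents = f.read()
    parser.feed(contents)  # feed() requires the data argument
    parser.close()
    return parser.count


# Self-contained demo: create a small stand-in for myhtml.html
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "myhtml.html")
    with open(sample_path, "w", encoding="utf-8") as f:
        f.write("<!-- one --><p>text</p><!-- two -->")
    total = count_comments(sample_path)

print("Comments found:", total)
```

Opening the file with `with` also makes the `f.mode == "r"` check from the question unnecessary, since `open()` defaults to read mode.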