
I can't seem to access any new variables I add in HTMLParser. I'm following the examples I've seen here. I don't get any errors adding a variable inside __init__, but when I try to access it in a method I'm told it doesn't exist.

#!/usr/bin/env python
from HTMLParser import HTMLParser
import urllib

class parse(HTMLParser):

    def __init__(self, data):
        HTMLParser.__init__(self)
        self.feed(data)
        self.foo = 'err'

    def handle_starttag(self, tag, attrs):
        print self.foo
        if tag == 'a':
            for attr, value in attrs:
                if attr == 'href':
                    print value[10:]
                    continue

    def handle_data(self, text):
        pass

    def handle_endtag(self, tag):
        pass


page = urllib.urlopen('http://docs.python.org/library/htmlparser.html').read()
p = parse(page)

Here's the output:

Traceback (most recent call last):
  File "./doit.py", line 34, in <module>
    p = parse(page)
  File "./doit.py", line 9, in __init__
    self.feed(data)
  File "/usr/lib/python2.6/HTMLParser.py", line 108, in feed
    self.goahead(0)
  File "/usr/lib/python2.6/HTMLParser.py", line 148, in goahead
    k = self.parse_starttag(i)
  File "/usr/lib/python2.6/HTMLParser.py", line 271, in parse_starttag
    self.handle_starttag(tag, attrs)
  File "./doit.py", line 14, in handle_starttag
    print self.foo
AttributeError: parse instance has no attribute 'foo'

Thanks for your help.

Nona Urbiz

2 Answers

You just have to swap these two lines in __init__:

self.feed(data)
self.foo = 'err'

Calling .feed() implicitly calls .handle_starttag(), but this is done before the creation of the attribute in your code.

Probably an even better idea would be not to feed data to the constructor at all, but rather call .feed() explicitly.
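
The explicit-feed approach could look roughly like the sketch below. The class name LinkParser, the links list, and the inline HTML string are hypothetical and only for illustration; the import fallback is an assumption so that the example runs on Python 3 (where the module is html.parser) as well as on Python 2.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser   # Python 2, as in the question


class LinkParser(HTMLParser):
    """Hypothetical parser that collects the href of every <a> tag."""

    def __init__(self):
        HTMLParser.__init__(self)
        # All state is set up here; nothing is parsed during construction.
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            for attr, value in attrs:
                if attr == 'href':
                    self.links.append(value)


parser = LinkParser()
# The caller feeds the data explicitly, after the instance is fully built.
parser.feed('<p><a href="/one">1</a> <a href="/two">2</a></p>')
parser.close()
```

Because the constructor no longer parses anything, the order of the assignments inside __init__ stops mattering, and the same instance can be fed data in several chunks.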

Sven Marnach
self.handle_starttag(tag, attrs)

is being called in HTMLParser.py before

self.foo = 'err'

has been set in your code.

Try:

self.foo = 'err'
self.feed(data)
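
Applied to a cut-down version of the class from the question, the reordering might look like the sketch below. The class name Parse, the seen list, and the inline HTML string are hypothetical stand-ins (the question downloads a real page instead); the import fallback is an assumption so that it also runs on Python 3.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser   # Python 2, as in the question


class Parse(HTMLParser):
    def __init__(self, data):
        HTMLParser.__init__(self)
        self.foo = 'err'   # set the attribute first...
        self.seen = []     # hypothetical list, for illustration only
        self.feed(data)    # ...because feed() triggers handle_starttag()

    def handle_starttag(self, tag, attrs):
        # self.foo already exists by the time this handler runs.
        self.seen.append((tag, self.foo))


p = Parse('<a href="http://example.com/">link</a>')
```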
William