I am using Python to retrieve the HTML of a web page and then parse it with my MyHtmlParser class. If I find certain data in the HTML, I want to add it to a links list and make that list available to main().
import urllib2
from MyHtmlParser import MyHtmlParser

def HtmlRetrieve(url):
    req = urllib2.Request(url, headers={'User-Agent': "Magic Browser"})
    con = urllib2.urlopen(req)
    return con.read()

def main():
    url = "someUrl.com"
    html = HtmlRetrieve(url)
    parser = MyHtmlParser()
    parser.feed(html)
    print parser.links

main()
This is my MyHtmlParser class:
from HTMLParser import HTMLParser

class MyHtmlParser(HTMLParser):
    def __init__(self):
        HTMLParser.__init__(self)
        self.links = []

    def handle_data(self, data):
        if data == "some text":
            self.links.append(data)
The code above appends the data to self.links, but back in main(), parser.links is empty. What do I need to do to get the data from the MyHtmlParser instance back to main()?
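For reference, here is a minimal self-contained sketch of the same pattern that runs without a network call. The class name LinkTextParser, the strip flag, the collect() helper and the HTML strings are all made up for illustration and are not part of my real code. It suggests that reading the links attribute from the caller after feed() should work, and that one possible reason for an empty list (an assumption, since the real page is not shown here) is that handle_data receives the text with surrounding whitespace, so an exact == comparison never matches.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser   # Python 2, as in the code above


class LinkTextParser(HTMLParser):
    """Hypothetical stand-in for MyHtmlParser; collects matching text nodes."""

    def __init__(self, strip=False):
        HTMLParser.__init__(self)
        self.links = []     # per-instance list, readable by the caller after feed()
        self.strip = strip  # illustrative flag, not part of HTMLParser

    def handle_data(self, data):
        # Optionally drop surrounding whitespace before the exact comparison
        text = data.strip() if self.strip else data
        if text == "some text":
            self.links.append(text)


def collect(html, strip=False):
    """Feed an HTML string to a fresh parser and return what it collected."""
    parser = LinkTextParser(strip=strip)
    parser.feed(html)
    parser.close()
    return parser.links


exact = collect("<p>some text</p>")
padded = collect("<p> some text\n</p>")
padded_stripped = collect("<p> some text\n</p>", strip=True)

print(exact)
print(padded)
print(padded_stripped)
```

In this sketch the list is filled and visible to the caller when the text node is exactly "some text", stays empty when the same text is padded with whitespace, and is filled again once the data is stripped before comparing.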