-4

I want to call the function spider, which is defined inside a class, with the parameters url, word and maxPages.

When I call it as shown below, I get an error because spider() takes exactly 3 arguments but receives 4.

Can someone explain how to correctly call a function that is defined inside a class?

My code looks like this:

    import HTMLParser
    from urllib2 import urlopen
    from pandas.io.parsers import TextParser 

    class LinkParser(HTMLParser.HTMLParser):
    #other methods

    def spider(url,word,maxPages):
        pagesTovisit = [url]
        numberVisited=0
        foundWord = False
        maxPages = 0
        while numberVisited < maxPages and pagesTovisit != [] and not foundWord:
            numberVisited = numberVisited +1


            url = pagesTovisit[0]
            pagesTovisit = pagesTovisit[1:]
            try:
                print numberVisited, "Visiting:", url
                parser = LinkParser()
                data, links = parser.getLinks(url)
                if data.find(word)>-1:
                    foundWord = True
                    pagesTovisit = pagesTovisit +links
                    print "Success"
            except:
                print "failed"
        if foundWord:
            print "the word",word,"was found at",url
        else:
            print "word not found"


    url = raw_input("enter the url: ")
    word = raw_input("enter the word to search for: ")
    maxPages = raw_input("the max pages you want to search in for are: ")

    lp=LinkParser()
    lp.spider(url,word,maxPages)
Coolfrog

2 Answers

3

The indentation in your post is broken, but I assume spider is meant to be inside the class. Add self as the first parameter of the function to make it a method:

    class LinkParser(HTMLParser.HTMLParser):
        def spider(self, url, word, maxPages):
            ...

Inside your spider method you create a new LinkParser just to call getLinks(). Since spider is already a method of that class, call self.getLinks(...) instead, which does not create a new instance. In general, attributes and other methods of the instance are reached through self inside a method:

    def methodOfClass(self, additionalArguments):
        self.memberName
        self.methodName(methodArguments)
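
To make the self.getLinks(...) suggestion concrete, here is a minimal sketch. The class Crawler, its pages dictionary and the get_links method are hypothetical stand-ins for your LinkParser and getLinks, so that the example runs without network access. The loop is simplified, and queuing the links of every visited page is an assumption about what you intended, not a copy of your code.

```python
class Crawler(object):
    """Hypothetical stand-in for LinkParser; uses a fake in-memory web."""

    def __init__(self, pages):
        # pages maps a URL to a (text, links) tuple
        self.pages = pages

    def get_links(self, url):
        # Stand-in for getLinks(): returns (data, links) for a URL
        return self.pages.get(url, ("", []))

    def spider(self, url, word, max_pages):
        # self is listed first; the caller passes only url, word, max_pages
        to_visit = [url]
        visited = 0
        while visited < max_pages and to_visit:
            visited += 1
            url = to_visit.pop(0)
            # Call the sibling method through self, no new instance needed
            data, links = self.get_links(url)
            if word in data:
                return url
            to_visit.extend(links)
        return None


fake_web = {
    "a": ("hello", ["b"]),
    "b": ("the word python is here", []),
}
crawler = Crawler(fake_web)
found_at = crawler.spider("a", "python", 5)  # three arguments at the call site
```

The call site passes three arguments while the definition lists four parameters; Python fills in self with crawler.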
Gábor Fekete
0

Ignoring the indentation errors, which I assume are only copy-paste issues:

Every instance method in Python implicitly receives the instance it is called on as its first argument, so its definition has to account for that.

Change def spider(url, word, maxPages) to def spider(self, url, word, maxPages).
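
A small sketch of why the argument count is off by one. The class Greeter and its methods are made up for illustration; the point is only that calling a method on an instance passes that instance as an extra first argument.

```python
class Greeter(object):
    def bad(name):
        # Missing self: the instance itself would land in `name`
        return name

    def good(self, name):
        return "hello " + name


g = Greeter()

# g.good("x") is equivalent to calling the function with g passed explicitly
same = g.good("x") == Greeter.good(g, "x")

# g.bad("x") hands two arguments (g and "x") to a one-parameter function
try:
    g.bad("x")
    error = None
except TypeError as exc:
    error = exc
```

Here error ends up holding a TypeError, the same kind of error your three-parameter spider raises when it receives four arguments.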

DeepSpace