
This is my code:

class LangList(SGMLParser):
    is_span = ""
    langs = []
    def start_span(self, attrs):
        for key, value in attrs:
            if key == 'class' and value == 'lang':
                self.is_span = 1
    def end_span(self):
        self.is_span = ""
    def handle_data(self, text):
        if self.is_span:
            self.langs.append(text)

...

for key in my_repositories.repositories.keys():
    print key

    each_repository_content = urllib2.urlopen(my_repositories.repositories[key]).read()

    my_repository = LangList()
    my_repository.feed(each_repository_content)

    print my_repository.langs

This is the result:

forensic_tools
['Python']
google
['Python', 'Python']
ListServices
['Python', 'Python', 'Java', 'Perl']
win32-assembly-projects
['Python', 'Python', 'Java', 'Perl', 'C']
...

I am writing an application that collects repository information for a GitHub member.

When I print the list, I find that it has not been reinitialized for each repository and contains repeated elements from earlier ones. How do I solve this problem?

Burger King
I'm guessing by "array hasn't been initial", you mean "when I create a new instance of the class, `langs` retains values that were added by previous instances". If you want `langs` to be new every time, create it inside the `__init__` method of the class instead of creating it directly under the `class` line. – Kevin Feb 18 '15 at 20:00

1 Answer

Your `langs` is a class variable, not an instance variable, so it belongs to the class definition and is shared by every instance rather than tied to any particular one. Each new `LangList()` therefore appends to the same list that earlier instances already filled.

You probably want something more like:

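import sgmllib

class LangList(sgmllib.SGMLParser):
    def __init__(self, *args, **kwargs):
        # SGMLParser is an old-style class in Python 2, so call the base
        # initializer directly; super() raises a TypeError here.
        sgmllib.SGMLParser.__init__(self, *args, **kwargs)
        self.is_span = ""
        self.langs = []  # a fresh list for each instance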

where you're creating the instance variables upon initialization.
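To see the difference in isolation, here is a minimal sketch that leaves out the parser entirely. The class names `SharedLangs` and `OwnLangs` and the `add` method are hypothetical and used only for illustration; they are not part of the question's code.

```python
class SharedLangs(object):
    langs = []  # class attribute: one list shared by every instance

    def add(self, lang):
        # self.langs resolves to the class attribute, so this mutates
        # the single shared list.
        self.langs.append(lang)


class OwnLangs(object):
    def __init__(self):
        self.langs = []  # instance attribute: a new list per instance

    def add(self, lang):
        self.langs.append(lang)


first, second = SharedLangs(), SharedLangs()
first.add("Python")
second.add("Java")
shared_result = second.langs               # contains both entries
same_object = first.langs is second.langs  # both names point at one list

third, fourth = OwnLangs(), OwnLangs()
third.add("Python")
fourth.add("Java")
own_result = fourth.langs                  # contains only "Java"
```

The first pair accumulates entries across instances, which matches the growing lists in the question's output; the second pair keeps each instance's entries separate.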

Nick T