I'm learning Python by myself, and just for fun I'm trying to write an XML parser from scratch. My code is currently at the point shown below, and I don't know why this line adds the attribute to every `Objecto` in the list, not just the last one.

parse[-1].attributes.append(attribute)

Using PyCharm's debugger, I've confirmed that this specific line is the problem: it adds the attribute string to every object already parsed. Can you help me?

capture = False
capture_attribute = False
capture_content = False  # initialized here so it is defined before its first read
string = ''
content = ''
attribute = ''
parse = []  # List of content


class Objecto:

    def __init__(self, name, txt = None, attribute = [], content = [] ):
        self.name = name
        self.txt = txt
        self.attributes = attribute
        self.content = content


with open('test.xml') as f:
    for line in f.readlines():
        for i, character in enumerate(line):
            # TAGs
            if capture is True:
                string += character
            if character == '<':
                capture = True
                capture_content = False
            if character == '>':
                string = string[:-1]
                capture = False
            if capture is True and character == ' ':
                string = string[:-1]
                capture = False
                capture_attribute = True
            if capture is False and string != '':
                parse.append(Objecto(string))
                string = ''

            # TEXT CONTENT
            if 0 < i < len(line) - 1:
                if line[i - 1] == '>' or line[i + 1] == '<':
                    if line[i] != '<':
                        capture_content = True
            if capture_content is True:
                content += character
            if capture_content is False:
                if content != '':
                    parse[-1].txt = content
                    content = ''

            # ATTRIBUTES
            if capture_attribute is True:
                if line[i+1] == ' ' or line[i+1] == '>':
                    attribute += character
                    parse[-1].attributes.append(attribute)
                    capture_attribute = False
                    attribute = ''
                elif character != ' ':
                    attribute += character
SMan
  • The issue is most likely your default arguments ``attribute = [], content = []``. There's a duplicate question on SO that explains in detail why using mutable default arguments is usually a bad idea: https://stackoverflow.com/questions/1132941/least-astonishment-and-the-mutable-default-argument – Mike Scotty Feb 24 '23 at 13:14
  • Absolutely. Many thanks for your answer. I'll study why this actually happens. – SMan Feb 24 '23 at 13:20
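
For anyone landing here, a minimal sketch of what the comment above points at. Python evaluates a default argument once, when the function is defined, so every instance created without an explicit `attribute` ends up holding the same list. The class names `SharedDefault` and `FreshDefault` are hypothetical and only loosely mirror `Objecto` from the question. The `None` sentinel is one common workaround, not the only one.

```python
class SharedDefault:
    # Mirrors the question's signature: the [] is created once, at definition time.
    def __init__(self, name, attribute=[]):
        self.name = name
        self.attributes = attribute


class FreshDefault:
    # Hypothetical fixed variant: None is a sentinel, and a new list is built per instance.
    def __init__(self, name, attribute=None):
        self.name = name
        self.attributes = [] if attribute is None else attribute


shared = [SharedDefault("a"), SharedDefault("b")]
shared[-1].attributes.append("id=1")
shared_result = [obj.attributes for obj in shared]  # both entries see the append

fresh = [FreshDefault("a"), FreshDefault("b")]
fresh[-1].attributes.append("id=1")
fresh_result = [obj.attributes for obj in fresh]  # only the last entry changes

print(shared_result)  # [['id=1'], ['id=1']]
print(fresh_result)   # [[], ['id=1']]
```

Applied to the question's code, the same idea would mean changing both the `attribute` and `content` defaults to `None` and creating the lists inside `__init__`.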
