
I'm trying to do the following: after performing a regex group search, I want to assign the results to the class properties in a specific order. The number of results from the regex search varies from 1 to 5 values.

import re

class Classification():
    def __init__(self, Entry):
        self.Entry = Entry
        self.Section = ''
        self.Class = 'Null'
        self.Subclass = 'Null'
        self.Group = 'Null'
        self.Subgroup = 'Null'


    def ParseSymbol(self,regex):

        Properties_Pointers = [self.Section,self.Class,self.Subclass,self.Group,self.Subgroup]

        Pattern_groups = re.search(regex, self.Symbol)

        i = 0
        for group in Pattern_groups.groups():
            Properties_Pointers[i] = group
            i += 1

The problem is that on each loop iteration, Properties_Pointers[i] holds the property's value rather than the class property itself, so assigning to it does not set the desired value on the property.
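
A minimal sketch of the behaviour described above, using a hypothetical `Demo` class that is not part of the original code. The list stores the objects the attributes currently refer to, so rebinding a list slot leaves the attribute untouched:

```python
class Demo:
    def __init__(self):
        self.Section = ''
        self.Class = 'Null'


d = Demo()

# The list holds the current attribute values, not references to the attributes.
pointers = [d.Section, d.Class]

# This rebinds the list slot only; d.Class still refers to 'Null'.
pointers[1] = 'A01'

print(pointers[1])  # A01
print(d.Class)      # Null
```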

Thanks.

Rgo

3 Answers

Refer to attribute names instead, and use the setattr() function to store a new value on self:

def ParseSymbol(self, regex):
    attributes = ['Section', 'Class', 'Subclass', 'Group', 'Subgroup']

    Pattern_groups = re.search(regex, self.Symbol)

    for group, attr in zip(Pattern_groups.groups(), attributes):
        setattr(self, attr, group)

setattr() lets you set an attribute whose name is held in a variable, here taken from the attributes list; there is also a companion getattr() function to retrieve attributes dynamically.
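
As a small illustration of the two built-ins, here is a sketch with a hypothetical `Record` class (the class and the attribute names are assumptions made for the example only):

```python
class Record:
    pass


r = Record()
name = 'Section'            # attribute name held in a variable

setattr(r, name, 'A')       # same effect as r.Section = 'A'

print(r.Section)                      # A
print(getattr(r, name))               # A
# The optional third argument is returned when the attribute does not exist.
print(getattr(r, 'Missing', 'Null'))  # Null
```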

Martijn Pieters

setattr() will set the attributes of an object based on a string name. You can rewrite ParseSymbol above:

    def ParseSymbol(self,regex):

        Properties_Pointers = ['Section','Class','Subclass','Group','Subgroup']

        Pattern_groups = re.search(regex, self.Symbol)

        i = 0
        for group in Pattern_groups.groups():
            setattr(self, Properties_Pointers[i], group)
            i += 1

As a side note, you can iterate over both Pattern_groups.groups() and Properties_Pointers simultaneously by using zip(). This cleans up the code by removing the index variable i and its increment:

        for pointer, group in zip(Properties_Pointers, Pattern_groups.groups()):
            setattr(self, pointer, group)
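
Putting the pieces together, the following is a self-contained sketch rather than the asker's actual class. It assumes the input string is stored as `self.Symbol`, and the sample symbol and three-group pattern are invented for illustration. It also guards against `re.search` returning `None`, which the original code does not do:

```python
import re


class Classification:
    def __init__(self, symbol):
        self.Symbol = symbol        # assumed attribute name for the input string
        self.Section = ''
        self.Class = 'Null'
        self.Subclass = 'Null'
        self.Group = 'Null'
        self.Subgroup = 'Null'

    def ParseSymbol(self, regex):
        names = ['Section', 'Class', 'Subclass', 'Group', 'Subgroup']
        match = re.search(regex, self.Symbol)
        if match is None:           # re.search returns None when nothing matches
            return
        # zip stops at the shorter sequence, so any names without a
        # corresponding group keep their default values.
        for name, group in zip(names, match.groups()):
            setattr(self, name, group)


c = Classification('A01B')
c.ParseSymbol(r'([A-H])(\d{2})([A-Z])')   # hypothetical three-group pattern
print(c.Section, c.Class, c.Subclass, c.Group, c.Subgroup)
# A 01 B Null Null
```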
space

If you know that your regex will always contain the same number of groups, you can just use tuple unpacking:

self.Section, self.Class, self.Subclass, self.Group, self.Subgroup = Pattern_groups.groups()
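
To show what happens when the counts differ, here is a sketch using an invented three-group pattern. Unpacking into five names raises `ValueError`; padding the tuple first is one possible workaround, not something proposed in the original answer:

```python
import re

# Hypothetical pattern with three groups, matched against a sample symbol.
match = re.search(r'([A-H])(\d{2})([A-Z])', 'A01B')

try:
    # Five targets, three values: tuple unpacking needs an exact match.
    section, cls, subclass, group, subgroup = match.groups()
except ValueError as exc:
    print('unpacking failed:', exc)

# Possible workaround: pad the tuple up to five values before unpacking.
groups = match.groups()
padded = groups + ('Null',) * (5 - len(groups))
section, cls, subclass, group, subgroup = padded
print(padded)  # ('A', '01', 'B', 'Null', 'Null')
```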
dorian
  • Except that the regular expression may contain fewer groups than attributes? – Martijn Pieters Oct 03 '13 at 07:40
  • Yes, but in that case the original code would also assign the matching groups to the wrong attributes so I assume that all the groups will match for the expected input. – dorian Oct 03 '13 at 07:46
  • Why would it do that? If the regex has 3 groups, then only the first three attributes are assigned to. – Martijn Pieters Oct 03 '13 at 07:48
  • I was talking about the case where some groups might not match, e.g. `r'(SECTION:.*?)?(CLASS:.*?)?(SUBCLASS:.*?)?'`. But if there are fewer actual groups in the regex than there are properties, then of course tuple unpacking won't work. – dorian Oct 03 '13 at 07:55
  • Clearly the original code did not use named groups; it was expecting to work with positional groups only. :-) – Martijn Pieters Oct 03 '13 at 07:56
  • You are right, but `r'(SECTION:.*?)?(CLASS:.*?)?(SUBCLASS:.*?)?'` does not contain any named groups :). Actually I just realized that non-matching groups are returned as `None` by `groups()`, so it's really just a matter of the number of groups in the regex. – dorian Oct 03 '13 at 08:00
  • Ah, my mistake; I thought you were using a pseudo-regex syntax to name groups, but you meant to show empty groups instead. Yes, optional groups are *always* returned, even if empty. – Martijn Pieters Oct 03 '13 at 08:01
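
The point about optional groups can be checked with a short sketch. The pattern is a variant of the one in the comments, with `\w+` swapped in for `.*?` so that the groups capture visible text; the input string is invented for the example:

```python
import re

# Three optional groups; only the first one participates in this match.
pattern = r'(SECTION:\w+)?(CLASS:\w+)?(SUBCLASS:\w+)?'
match = re.search(pattern, 'SECTION:A')

# groups() always returns one entry per group in the pattern.
print(match.groups())                # ('SECTION:A', None, None)

# The default argument replaces None for groups that did not participate.
print(match.groups(default='Null'))  # ('SECTION:A', 'Null', 'Null')
```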