0

I'm setting up a small function to run a configurable list of regexes over a line. So far I have two regexes, but it seems to execute only one of the patterns. Here's my code:

def run_all_regex(self, line):

    regexp = {
        'regex1' : 'complicated regex',
        'regex2' : 'complicated regex2',
    }

    for key, pattern in regexp.iteritems():

        m = match(pattern, line)

        if m:
            x = line
        else:
            x = None

        return x

I added a print statement after my `for key, ...` line to see which patterns were being looped over, and it was only the second one! After I removed the second one, the first one printed! What gives?

EDIT:

So I've seen that my return statement has bogged up my function, I'd like to elaborate on what I'm trying to do here.

Basically I am opening a file, reading it line by line, and for each line running this function that will run the two (so far) regexes on the line, and if that line matches either regex, return it to a key in a dict. Here's the code.

for key in dict:
   with open(key) as f:
      for line in f:
          dict[key] = self.run_all_regex(line)

So at the end of it all, dict[key] should be a line that matches the regex that I have in the run_all_regex section.

openingceremony
  • `return` ends the function and thus terminates the loop. What are you intending to accomplish with that `return`? A function can only return one value; you can't return multiple times. – BrenBarn Oct 15 '14 at 22:18
  • Yikes! What a brainfart on my part, I'm trying to return the line that matches the pattern, to a variable. But I'd like to run both regexes on each line that gets passed into the function.. guess I'll need another way around that... – openingceremony Oct 15 '14 at 22:20
  • You can effectively "return multiple times" if you use `yield` instead of `return` but it's unclear whether that's the desired behaviour. – jez Oct 15 '14 at 22:22
  • @jez I seriously doubt the OP is looking for a `yield` statement here. Even if he were it is NOT the same thing as 'returning multiple times', which is a paradoxical expression. – lstyls Oct 15 '14 at 22:34
  • As for _"After I removed the second one, the first one printed!"_ - dicts are not ordered so the second may have been the first one tried. If order is important, try `collections.OrderedDict`. – tdelaney Oct 15 '14 at 23:10

3 Answers

3

return x is ending your for loop after its first iteration.
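A minimal sketch of one way to avoid the early exit, assuming the goal is to learn which patterns matched: collect the hits inside the loop and return once, after the loop has finished. The patterns below are placeholders (the question does not show the real ones), and the sketch uses Python 3's `.items()` and a plain function rather than a method.

```python
import re

# Placeholder patterns; the real ones are not shown in the question.
REGEXP = {
    'regex1': r'foo\d+',
    'regex2': r'bar\w+',
}

def run_all_regex(line, patterns=REGEXP):
    """Return the names of all patterns that match at the start of line."""
    matched = []
    for key, pattern in patterns.items():
        if re.match(pattern, line):
            matched.append(key)
    # Return only after every pattern has been tried.
    return matched
```

An empty list then means that no pattern matched, so the caller can test the result directly with `if run_all_regex(line):`.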

ErlVolton
  • *blushes* As I mentioned in my comment above, if there is a match to either pattern, I'm trying to return it to a variable assignment, is there a way to do this without returning? I'm thinking I'll have to remove this code from a function and run it as is... – openingceremony Oct 15 '14 at 22:22
  • Append matches to a list and return the list – ErlVolton Oct 15 '14 at 22:25
1
import re

def run_all_regex(self, line):
    regexp = {
        'regex1' : 'complicated regex',
        'regex2' : 'complicated regex2',
    }
    results = {}
    for key, pattern in regexp.iteritems():
        m = re.match(pattern, line)
        if m:
            results[key] = line
        else:
            results[key] = None
    return results

This will return a dictionary with the results of each regex being stored as the value and the keys being the keys from the regex dictionary.
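As a sketch of how a caller might consume that dictionary, assuming the aim from the question's edit is to keep the lines that match at least one pattern: the loop below collects matching lines from a file-like object. `collect_matching_lines` and the sample patterns are hypothetical, an in-memory `io.StringIO` stands in for an open file, and the syntax is Python 3.

```python
import io
import re

# Hypothetical patterns standing in for the real ones.
REGEXP = {
    'regex1': r'ERROR',
    'regex2': r'WARN',
}

def run_all_regex(line):
    """Map each pattern name to the line if it matches, else None."""
    return {key: (line if re.match(pattern, line) else None)
            for key, pattern in REGEXP.items()}

def collect_matching_lines(f):
    """Return every line of f that matched at least one pattern."""
    kept = []
    for line in f:
        results = run_all_regex(line)
        if any(value is not None for value in results.values()):
            kept.append(line.rstrip('\n'))
    return kept

# An in-memory file used only for illustration.
sample = io.StringIO("ERROR disk full\nall good\nWARN low memory\n")
matching = collect_matching_lines(sample)
```

Appending to a list avoids overwriting a single slot on every line, which would otherwise leave only the result for the last line of the file.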

kylieCatt
0

I am not sure what you want to achieve; it would be easier to suggest better code if you could explain exactly what you expect.

import re
def run_all_regex(line):
    regexp= {'regex1' : 'complicated regex','regex2' : 'complicated regex2'}

    for key, pattern in regexp.iteritems(): 
        m = re.match(pattern, line)
        if m:
            x = line
        else:
            x = None 

    return x

print run_all_regex('complicated regex')

If you would like to use a list comprehension instead of returning on every iteration:

res = [re.match(pattern, line) for pattern in regexp.itervalues()]

You can also use the built-in `any` function (or another aggregate such as `all`), depending on your needs:

res = any(re.match(pattern, line) for pattern in regexp.itervalues())

I also recommend reading the documentation for re.match to understand how regular expressions work (in case you are testing this algorithm to learn regular expressions and how to match one item against another).
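To make the difference concrete, here is a small sketch (Python 3, with made-up patterns) that contrasts a list comprehension with `any`, and `re.match`, which only matches at the start of the string, with `re.search`, which scans the whole string.

```python
import re

patterns = {'regex1': r'cat', 'regex2': r'dog'}  # made-up patterns
line = 'hot dog stand'

# One result per pattern: a match object, or None if it did not match.
matches = [re.match(p, line) for p in patterns.values()]

# any() stops at the first truthy result; later patterns are not tried.
matched_at_start = any(re.match(p, line) for p in patterns.values())
found_anywhere = any(re.search(p, line) for p in patterns.values())
```

Here neither pattern matches at the start of the line, so `matched_at_start` is false, while `re.search` finds `dog` in the middle of the line.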


Second answer, based on your edited question: let's take a quick, small example to clarify how regular expressions are used to match items.

First, let's check whether a string starts with two words that both begin with "a".

Pattern: (a\w+)\W(a\w+)

l= ["animal alfa", "do don't"]

for element in l:
    # Match if the string starts with two words that both begin with the letter a.
    m = re.match(r"(a\w+)\W(a\w+)", element)

    # See if success.
    if m:
        print(m.groups())

Expected Output: ('animal', 'alfa')

2- Let's generalize the idea to several regular expressions

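# Split the text to store every line in the list.

text = """
dog dot\n say said \n bolo bala \n no match
"""
# Strip surrounding whitespace and skip the empty first and last lines.
l = [s.strip() for s in text.split("\n") if s.strip()]
# Output
# l = ["dog dot", "say said", "bolo bala", "no match"]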

# Some RE patterns for testing:
# regex1: verify that two words start with d
# regex2: verify that two words start with s
# regex3: verify that two words start with b

regexp = {'regex1': r"(d\w+)\W(d\w+)", 'regex2': r"(s\w+)\W(s\w+)", 'regex3': r"(b\w+)\W(b\w+)"}

# Loop over the list of items
for element in l:
    # Try every pattern on the element. Each entry pairs the pattern's key with a match object (or None if there is no match).
    m = [(key, re.match(reg, element)) for key, reg in regexp.iteritems()]

    # Keep only the successful matches, storing the key and the matched groups as a tuple
    listReg = [(key, v.groups()) for key, v in m if v]

I hope this helps you understand how to solve your problem (I am not sure I understood everything).

user3378649