28

I need to check if any of the strings in a list match a regex. If any do, I want to continue. The way I've always done it in the past is using a list comprehension with something like:

r = re.compile('.*search.*')
if [line for line in output if r.match(line)]:
  do_stuff()

Which I now realize is pretty inefficient. If the very first item in the list matches, we can skip all the rest of the comparisons and move on. I could improve this with:

r = re.compile('.*search.*')
for line in output:
  if r.match(line):
    do_stuff()
    break

But I'm wondering if there's a more pythonic way to do this.

Xavier Guihot
ewok
  • Why not use the builtin `any()`? Eg: `if any(re.match(line) for line in output)` – MrAlexBailey Jun 22 '16 at 16:49
  • @Jkdc because `any()` takes a list and converts each element into a `bool`, then evaluates the bool. So in order to get it to the point that `any()` would be useful, I'd still have to do the regex match on every element. – ewok Jun 22 '16 at 16:51
  • @ewok: no, `any` takes something which is *iterable*. jkdc's code uses a lazy generator expression, not a list. – DSM Jun 22 '16 at 16:51
  • @DSM. The lazy generator was what I was looking for. His initial comment didn't include that. – ewok Jun 22 '16 at 16:52
  • aside from using `any` i think you've just answered your question with your second attempt – danidee Jun 22 '16 at 16:52
  • @Jkdc the lazy generator there is what I was looking for. add it as an answer – ewok Jun 22 '16 at 16:52

4 Answers

34

You can use the builtin any():

r = re.compile('.*search.*')
if any(r.match(line) for line in output):
    do_stuff()

Passing a lazy generator expression to any() lets it return on the first match without evaluating the rest of the iterable.
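
To make the short-circuiting visible, here is a minimal sketch that records which lines the regex was actually applied to. The sample `output` list and the `match_and_record` helper are hypothetical and only there for illustration:

```python
import re

r = re.compile('.*search.*')
output = ['searching now', 'nothing here', 'search again']  # hypothetical sample data

checked = []  # records every line the regex was actually applied to

def match_and_record(line):
    # hypothetical helper: note the line, then run the real match
    checked.append(line)
    return r.match(line)

found = any(match_and_record(line) for line in output)
print(found)    # True
print(checked)  # ['searching now']
```

Only the first line is ever tested, because any() stops pulling from the generator as soon as it sees a truthy value.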

MrAlexBailey
  • Any way to access the matched string using this method? I'd like to print it for logging purposes – nat5142 Feb 23 '18 at 18:08
  • IMHO, instead of `re.match` shouldn't it be `r.match` as OP has created a compiled regular expression object `r = re.compile('.*search.*')` – Kaushik Acharya Dec 06 '20 at 06:47
15

Starting with Python 3.8 and its introduction of assignment expressions (PEP 572, the := operator), we can also capture a witness of an any expression when a match is found and use it directly:

import re

pattern = re.compile('.*search.*')
items = ['hello', 'searched', 'world', 'still', 'searching']
# := binds match in the enclosing scope, so it is still available after any() returns
if any((match := pattern.match(x)) for x in items):
  print(match.group(0))
# 'searched'

For each item, this:

  • Applies the regex match (pattern.match(x))
  • Assigns the result to the match variable (either None or a re.Match object)
  • Uses the truth value of match as part of the any expression (None -> False, Match -> True)
  • If match is None, the any search loop continues
  • If match is a re.Match object, the any expression stops and evaluates to True, and the match variable can be used within the condition's body
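
The steps above can be sketched as an explicit loop. This is only an illustration of what the any/:= combination does, not a different recommendation; it also happens to work on Python versions before 3.8:

```python
import re

pattern = re.compile('.*search.*')
items = ['hello', 'searched', 'world', 'still', 'searching']

match = None
for x in items:
    match = pattern.match(x)  # None or a re.Match object
    if match:                 # a Match object is truthy, None is falsy
        break                 # stop at the first match, as any() does

if match:
    print(match.group(0))  # searched
```

One difference worth noting: here match is initialised to None, so it is defined even when items is empty, whereas in the any version the name is only bound once the generator has produced at least one item.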
Xavier Guihot
8

Given that I am not allowed to comment yet, I wanted to provide a small correction to MrAlexBailey's answer, and also answer nat5142's question. The correct form (calling match on the compiled pattern r rather than on the re module) would be:

r = re.compile('.*search.*')
if any(r.match(line) for line in output):
    do_stuff()

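If you want the matching lines themselves, for example to log them, you can collect them with a list comprehension (note that this tests every line rather than stopping at the first match):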

lines_to_log = [line for line in output if r.match(line)]
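
If only the first matching line is needed, one possible alternative (a sketch, not part of the original answer) is the builtin next() with a default value, which stops at the first hit just as any() does. The sample `output` list and the name `first_hit` are hypothetical:

```python
import re

r = re.compile('.*search.*')
output = ['hello', 'searched', 'world', 'searching']  # hypothetical sample data

# next() pulls the first line the generator yields, or returns None if nothing matches
first_hit = next((line for line in output if r.match(line)), None)
if first_hit is not None:
    print(first_hit)  # searched
```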

In addition, if you want to find all lines that match any compiled regular expression in a list of compiled regular expressions r=[r1,r2,...,rn], you can use:

lines_to_log = [line for line in output if any(reg_ex.match(line) for reg_ex in r)]
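
As a small runnable illustration of the multi-pattern version, assuming two made-up patterns and some made-up input lines:

```python
import re

# hypothetical patterns and input, chosen only for illustration
r = [re.compile('.*error.*'), re.compile('.*warn.*')]
output = ['all good', 'warning: disk low', 'error: timeout', 'done']

# keep a line if at least one of the compiled patterns matches it
lines_to_log = [line for line in output if any(reg_ex.match(line) for reg_ex in r)]
print(lines_to_log)  # ['warning: disk low', 'error: timeout']
```

The inner any() still short-circuits per line, so a line stops being tested as soon as one pattern matches it.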
I. Jovanov
1

In reply to a question asked by @nat5142 on the answer given by @MrAlexBailey: "Any way to access the matched string using this method? I'd like to print it for logging purposes", assuming "this" refers to:

if any(r.match(line) for line in output):
    do_stuff()

You can loop over the matching lines with a generator expression:

# r = re.compile('.*search.*')
for line in (line for line in output if r.match(line)):
    do_stuff(line) # <- using the matched line here

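Another approach is to apply a logging function to each matching line with map. In Python 3, map is lazy, so its result has to be consumed for the calls to actually happen:

# r = re.compile('.*search.*')
# log = print
list(map(log, (line for line in output if r.match(line))))  # list() forces the lazy map to run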

Although this does not involve the "any" function and might not even be close to what you desire...

I thought this answer was not very relevant, so here's my second attempt. I suppose you could do this:

def log_match(match):
    # print the match if there is one, then pass it through so any() can test it
    if match: print(match)
    return match

if any(log_match(r.match(line)) for line in output):
    do_stuff()
tinnick