151

I want to filter strings in a list based on a regular expression.

Is there something better than [x for x in list if r.match(x)] ?

leoluk

3 Answers

277

Full example (Python 3).
For Python 2.x, see the note below.

import re

mylist = ["dog", "cat", "wildcat", "thundercat", "cow", "hooo"]
r = re.compile(".*cat")
newlist = list(filter(r.match, mylist)) # Read Note below
print(newlist)

Prints:

['cat', 'wildcat', 'thundercat']

Note:

In Python 2.x, filter already returns a list. In Python 3.x, filter was changed to return an iterator, so it has to be converted to a list (in order to see it printed out nicely).
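As a rough illustration of the note above, the sketch below shows one practical consequence of the Python 3 behaviour: the iterator returned by filter can only be consumed once. It also shows the list comprehension from the question producing the same result. The variable names first_pass, second_pass and comprehension are made up for this example.

```python
import re

mylist = ["dog", "cat", "wildcat", "thundercat", "cow", "hooo"]
r = re.compile(".*cat")

# In Python 3, filter returns a lazy iterator, not a list.
matches = filter(r.match, mylist)

first_pass = list(matches)   # consumes the iterator
second_pass = list(matches)  # the iterator is already exhausted

# The list comprehension from the question builds the list directly.
comprehension = [s for s in mylist if r.match(s)]

print(first_pass)     # ['cat', 'wildcat', 'thundercat']
print(second_pass)    # []
print(comprehension)  # ['cat', 'wildcat', 'thundercat']
```

If the result is needed more than once, converting it to a list (or using the comprehension) avoids the empty second pass.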


Mercury
  • 4
    Hi there, When I run the above code, I get `` What am I doing wrong? –  Oct 13 '16 at 11:08
  • 1
    According to python docs (python 2.7.12): https://docs.python.org/2/library/functions.html#filter filter returns a list not an object. You can also check that code: https://repl.it/X3G/5786 (just hit run) – Mercury Oct 13 '16 at 13:24
  • 1
    Thank you. I am using Python 3.5.2 on a Mac. I tried your link. Of course it works, though not sure why I get that msg. I even removed the `str` since `filter` returns a list anyway, to no avail... –  Oct 14 '16 at 12:25
  • 4
    @joshua you've probably figured this out by now but try `print(list(newlist))` or `print([i for i in newlist])` – James Draper Jan 10 '17 at 18:42
  • 6
    This is ridiculously difficult. This is why R is superior. Simply grep(pattern, vector_of_names) – MadmanLee May 15 '19 at 03:42
  • How do you do this without compiling the regex first? Like `re.match(".*cat", my_string)`. – hobbes3 Jan 27 '20 at 23:22
  • The filter(..) call does that internally. We are passing r.match (a reference to a function) and a list of strings to match against. – Mercury Jan 28 '20 at 11:13
  • How would you do this with a regex grouping, when `p = re.compile(r'/simple/(.*)/')`? – not2qubit Dec 27 '21 at 20:11
  • I don't suppose there's a way to also include the index? – john k Jan 14 '22 at 21:24
  • you probably want to use `r.search` instead of match – grantr May 12 '22 at 06:32
155

You can create an iterator in Python 3.x or a list in Python 2.x by using:

filter(r.match, list)

To convert the Python 3.x iterator to a list, simply pass it to the list constructor: list(filter(..)).
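A minimal, self-contained sketch of the call above might look like the following. The pattern and the name strings are assumptions chosen for illustration; the input is deliberately not named list, because that would shadow the built-in and make list(filter(..)) fail. The sketch also contrasts r.match, which only matches at the start of each string, with r.search and r.fullmatch.

```python
import re

# Hypothetical data and pattern, chosen only for this example.
strings = ["dog", "cat", "wildcat", "thundercat", "cow", "catalog"]
r = re.compile("cat")

# match: the pattern must match at the beginning of the string.
starts_with = list(filter(r.match, strings))

# search: the pattern may match anywhere in the string.
contains = list(filter(r.search, strings))

# fullmatch: the pattern must match the entire string.
exact = list(filter(r.fullmatch, strings))

print(starts_with)  # ['cat', 'catalog']
print(contains)     # ['cat', 'wildcat', 'thundercat', 'catalog']
print(exact)        # ['cat']
```

Which of the three methods is appropriate depends on whether the pattern is meant to anchor at the start, match anywhere, or cover the whole string.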

Ma0
sepp2k
  • 2
    Actually, list comprehensions are usually preferred over functional constructs such as filter, reduce, lambda, etc. – Ivo van der Wijk Sep 04 '10 at 00:41
  • 41
    @Ivo: They are usually preferred because they're usually clearer and often more succinct. However in this case, the `filter` version is perfectly clear and has much less noise. – sepp2k Sep 04 '10 at 00:47
  • 11
    what is `r.match` here? – rbatt Oct 12 '18 at 11:48
  • 4
    @rbatt `r.match` is a method that, when applied to a given string, finds whether the regex `r` matches that string (and returns a corresponding match object if so, but that doesn't matter in this case as we just care whether the result is truthy) – sepp2k Oct 12 '18 at 21:33
  • 1
    Can anybody post an example? Where do I pass the search mask? – don't blink Dec 13 '21 at 06:30
  • 1
    First `import re`, second create your regex `r = re.compile(".*cat")` – Humbert Aug 05 '22 at 11:48
9

To do so without compiling the regex first, use a lambda function, for example:

from re import match

values = ['123', '234', 'foobar']
filtered_values = list(filter(lambda v: match(r'^\d+$', v), values))  # raw string avoids the invalid '\d' escape warning

print(filtered_values)

Returns:

['123', '234']

filter() just takes a callable as its first argument and returns an iterator (a list in Python 2.x) over the items for which that callable returned a 'truthy' value.
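Since any callable works, two other ways to skip the explicit compile step can be sketched as follows: functools.partial in place of the lambda, and a plain list comprehension. The names is_digits, via_partial, via_comprehension and strict are made up for this example. The last lines illustrate an edge case that may or may not matter for your data: '$' also matches just before a trailing newline, whereas re.fullmatch requires the whole string to match.

```python
import re
from functools import partial

values = ['123', '234', 'foobar']

# partial binds the pattern, leaving a one-argument callable for filter.
is_digits = partial(re.match, r'^\d+$')
via_partial = list(filter(is_digits, values))

# The same filter written as a list comprehension.
via_comprehension = [v for v in values if re.match(r'^\d+$', v)]

print(via_partial)        # ['123', '234']
print(via_comprehension)  # ['123', '234']

# Edge case: '$' matches before a trailing newline; fullmatch does not.
with_newline = ['123\n', '456']
loose = [v for v in with_newline if re.match(r'^\d+$', v)]
strict = [v for v in with_newline if re.fullmatch(r'\d+', v)]

print(loose)   # ['123\n', '456']
print(strict)  # ['456']
```

The module-level functions in re cache recently used compiled patterns, so calling re.match repeatedly with the same pattern string is usually not a significant cost.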

Collin Heist