How to find an index of an item in a list, searching the item with a regular expression in Python?

Question

I have a list like this:

lst = ['something', 'foo1', 'bar1', 'blabla', 'foo2']

Is it possible to get the index of the first item starting with "foo" (foo1) using regular expressions and lst.index() like:

ind = lst.index("some_regex_for_the_item_starting_with_foo") ?

I know I can create a counter and a for loop and use the startswith() method. I am curious whether I am missing a shorter and more elegant way.

score 3 · Accepted Answer · answered Jul 07 '11 at 20:19

I think that's fine: you can use the startswith method if it does what you really want (I am not sure you really need a regex here; however, the code below can easily be modified to use one):

data = ['text', 'foo2', 'foo1', 'sample']
indices = (i for i, val in enumerate(data) if val.startswith('foo'))

Or with regex:

from re import match
data = ['text', 'foo2', 'foo1', 'sample']
indices = (i for i, val in enumerate(data) if match('foo', val))
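
Both snippets above build a lazy generator of every matching index rather than a single number. Here is a minimal sketch of how one might consume it, using the list from the question; the names `pattern`, `first` and `rest` are illustrative and do not come from the answer:

```python
import re

data = ['something', 'foo1', 'bar1', 'blabla', 'foo2']
pattern = re.compile(r'foo')  # pattern.match() only matches at the start of a string

# Lazy generator of every index whose item matches the pattern.
indices = (i for i, val in enumerate(data) if pattern.match(val))

first = next(indices, None)  # first matching index, or None if nothing matches
rest = list(indices)         # whatever matching indices are left in the generator

print(first)  # 1
print(rest)   # [4]
```

Because the generator is lazy, asking only for `first` stops scanning at the first hit instead of walking the whole list.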

Andrew Clark · Answer 2 · 2011-07-11T16:23:10.300

There is no way to do it using the lst.index, however here is an alternative method that you may find more elegant than a for loop:

try:
    # .next() is the Python 2 spelling; Python 3 generators need next(gen) instead
    ind = (i for i, v in enumerate(lst) if v.startswith("foo")).next()
except StopIteration:
    ind = -1   # or however you want to say that the item wasn't found

As senderle pointed out in a comment, this can be shortened to one line by using the built-in next() function (2.6+) with a default value:

ind = next((i for i, v in enumerate(lst) if v.startswith("foo")), -1)
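
Since the question asked about regular expressions specifically, the same one-liner can be wrapped in a small helper. This is only a sketch: `first_match_index` is a hypothetical name, and returning -1 for "not found" is just one possible convention.

```python
import re

def first_match_index(pattern, items, default=-1):
    """Return the index of the first item matching `pattern`, else `default`."""
    regex = re.compile(pattern)
    # regex.match() only matches at the beginning of each string;
    # switch to regex.search() to match anywhere in the string.
    return next((i for i, item in enumerate(items) if regex.match(item)), default)

lst = ['something', 'foo1', 'bar1', 'blabla', 'foo2']
print(first_match_index(r'foo', lst))  # 1
print(first_match_index(r'qux', lst))  # -1
```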

score 1 · Answer 3 · answered Jul 07 '11 at 20:28

No, unfortunately there is no key parameter for list.index. If there were, a solution could have been:

# warning: NOT working code
result = L.index(True, key=lambda x: regexp.match(x) is not None)

Moreover, given that I just discovered that lambda is apparently considered an abomination in the Python community, I'm not sure whether more key parameters are going to be added in the future.
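
The non-working call above can be emulated with a plain function. The sketch below assumes a hypothetical helper named `index_by`; it is not part of the list API, and raising ValueError on a miss is chosen only to mirror what list.index does.

```python
import re

def index_by(items, value, key=lambda x: x):
    """Hypothetical stand-in for list.index(value, key=...); not a real list method."""
    for i, item in enumerate(items):
        if key(item) == value:
            return i
    # Mirror list.index, which raises ValueError when nothing is found.
    raise ValueError("no item whose key equals %r" % (value,))

regexp = re.compile(r'foo')
L = ['something', 'foo1', 'bar1', 'blabla', 'foo2']
result = index_by(L, True, key=lambda x: regexp.match(x) is not None)
print(result)  # 1
```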

6502

Don't you think `key` is useful without lambda? With `operator.itemgetter` for example? I'm also curious who thinks `lambda` is an abomination. It can be really ugly, sure, but it's an important part of the language I think, especially when you have a built-in function that doesn't do _quite_ what you want. – senderle Jul 10 '11 at 16:56
@senderle: yes, `key` can be useful in other cases, but in many common cases a small anonymous closure is just perfect for `key`. As for why `lambda` is so hated, I discovered this only recently (at EuroPython), when I asked why an example had used `functools.partial` in a case that should have been a job for `lambda`, and Alex Martelli replied <>. See for a longer explanation http://stackoverflow.com/q/3252228/320726 – 6502 Jul 10 '11 at 17:34
thanks, that clears things up for me. I think this is a case where (for me), practicality beats purity. I see AM's side of things, though; I guess I wouldn't cry (too hard) if `lambda` were removed. – senderle Jul 10 '11 at 17:45

senderle · Answer 4 · 2011-07-13T15:25:14.277

0

It would be kind of cool to have something like this built in. Python doesn't, though. There are a few interesting solutions using itertools. (These also made me wish for an itertools.takewhile_false. If it existed, these would be more readable.)

>>> from itertools import takewhile
>>> import re
>>> m = re.compile('foo.*')
>>> print(len(tuple(takewhile(lambda x: not m.match(x), lst))))
1

That was my first idea, but it requires you to create a temporary tuple and take its length. Then it occurred to me that you could just do a simple sum and avoid the temporary tuple:

>>> print(sum(1 for _ in takewhile(lambda x: not m.match(x), lst)))
1

But that's also somewhat cumbersome. I prefer to avoid throw-away variables when possible. Let's try this again.

>>> sum(takewhile(bool, (not m.match(x) for x in lst)))
1

Much better.
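
One behaviour worth checking before relying on the counting approach: when nothing matches, it returns the length of the list rather than a sentinel such as -1. Here is a short sketch; the wrapper `count_until_match` and the list `no_foo` are illustrative names, not part of the answer.

```python
import re
from itertools import takewhile

m = re.compile('foo.*')

def count_until_match(items):
    # Counts the leading items that do NOT match, which equals the index
    # of the first match when there is one.
    return sum(takewhile(bool, (not m.match(x) for x in items)))

lst = ['something', 'foo1', 'bar1', 'blabla', 'foo2']
no_foo = ['something', 'bar1', 'blabla']

print(count_until_match(lst))     # 1
print(count_until_match(no_foo))  # 3, i.e. len(no_foo): nothing matched
```

A caller can therefore detect "not found" by comparing the result with the length of the list.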

edited Jul 13 '11 at 15:25

answered Jul 07 '11 at 20:35

senderle

Your solution is quite extraordinary and at the same time not quite readable; however, I got what you did. I guess using "not" instead of a takewhile_false function is more natural, though. The same would apply if there were a while_false loop instead of "while smth != smth2" – rightaway717 Jul 08 '11 at 12:08
I've found "dropwhile" in itertools. I guess this is just what you meant by "takewhile_false" – rightaway717 Jul 09 '11 at 07:40
@rightaway717, no, `dropwhile` _discards_ items until the predicate is false, and then takes the rest, just as `takewhile` _takes_ items until the predicate is false and discards the rest. In other words, given the same iterable and predicate, `takewhile` will yield the first part of the list, and `dropwhile` will yield the second part of the list. – senderle Jul 09 '11 at 15:26
Sorry but this is terrible, you're building a tuple (potentially large) just to compute an index? – alexis Feb 18 '12 at 23:00
@alexis, well, that's why I improved on the first version, as you surely must have seen if you read the whole post. The later versions don't create tuples. I suppose it's possible that `sum` builds a tuple internally -- in which case I have to take issue with the implementation of `sum`. – senderle Feb 19 '12 at 01:20
@senderle: Sorry, I missed the part where you mention the tuple yourself. `takewhile` and `sum` wouldn't build a tuple. – alexis Feb 19 '12 at 09:56
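
The takewhile/dropwhile split discussed in the comments above can be seen on the list from the question. This is a small sketch; the predicate name `not_foo` is illustrative.

```python
from itertools import dropwhile, takewhile

lst = ['something', 'foo1', 'bar1', 'blabla', 'foo2']

def not_foo(s):
    return not s.startswith('foo')

head = list(takewhile(not_foo, lst))  # items before the first 'foo...' item
tail = list(dropwhile(not_foo, lst))  # the first 'foo...' item and everything after it

print(head)  # ['something']
print(tail)  # ['foo1', 'bar1', 'blabla', 'foo2']
```

With the same iterable and predicate the two pieces partition the list, so `len(head)` is the index of the first match.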

score 0 · Answer 5 · answered Jul 07 '11 at 20:52

l = ['something', 'foo1', 'bar1', 'blabla', 'foo2']
# list(...) keeps this working on Python 3, where filter() returns an iterator
l.index(list(filter(lambda x: x.startswith('foo'), l))[0])

Vader

I'll keep this solution in mind. I just started learning Python and didn't know it means the same as "i for i,val in ...". Now I know that. Thank you for your efforts – rightaway717 Jul 08 '11 at 12:11
