1

I want to do an all()-like comparison to test whether my substring is in every element of the list.

Some dummy data:

my_list = ['~/.tmp/myproject/filea', '~/.tmp/myproject/fileb']

I want to test if .tmp/myproject/ is in every item in this list.

ThorSummoner

2 Answers

3
all(['mysubstring' in item for item in my_list])

A list comprehension is perhaps the best way to do this kind of check, and best of all, you can still use all!

Python 2.7.6 (default, Mar 22 2014, 22:59:56)
Type "help", "copyright", "credits" or "license" for more information.
>>> my_list = ['~/.tmp/myproject/filea', '~/.tmp/myproject/fileb']
>>> my_list
['~/.tmp/myproject/filea', '~/.tmp/myproject/fileb']
>>> [item for item in my_list]
['~/.tmp/myproject/filea', '~/.tmp/myproject/fileb']
>>> ['/.tmp/myproject/' in item for item in my_list]
[True, True]
>>> all(['/.tmp/myproject/' in item for item in my_list])
True
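As a minimal sketch, the same check could be wrapped in a small helper and used in an if clause. The function name `all_contain` and the extra path `~/elsewhere/filec` are made up for illustration; the check itself is the list-comprehension form shown above.

```python
def all_contain(substring, items):
    """Return True if `substring` occurs in every string in `items`."""
    # Same check as above: build a list of booleans, then reduce it with all().
    return all([substring in item for item in items])


my_list = ['~/.tmp/myproject/filea', '~/.tmp/myproject/fileb']

if all_contain('/.tmp/myproject/', my_list):
    print('every item contains the substring')

# A single non-matching item is enough to make the whole check fail.
mixed = my_list + ['~/elsewhere/filec']  # hypothetical extra path
print(all_contain('/.tmp/myproject/', mixed))

# all() of an empty sequence is True, so an empty list passes vacuously.
print(all_contain('/.tmp/myproject/', []))
```

Note the last case: if an empty list should count as a failure, the caller has to check for that separately.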
ThorSummoner
  • 3
    *List comprehension is perhaps the best way*: a generator expression is even better. – vaultah Mar 06 '15 at 19:52
  • 1
    It would be interesting to see a path-optimized answer. I left this answer string-optimized for general purpose use. – ThorSummoner Mar 06 '15 at 19:53
  • 1
    @BhargavRao absolutely :) – ThorSummoner Mar 06 '15 at 19:55
  • 2
    I always get excited when I see `all` with a practical use, +1 – HavelTheGreat Mar 06 '15 at 19:58
  • @61612 I love the idea of using a generator, particularly if the dataset is huge and it could be used to break early (or rather, StopIteration early). I struggle to find a way to do it in an inline statement; [Generator Comprehension](http://stackoverflow.com/a/364824/1695680) seems to be limited. The list version seems like the ideal way to do this check in an if clause, unless I actually had a huge dataset and there was a non-negligible gain from using a bulkier generator call. – ThorSummoner Mar 06 '15 at 20:12
  • 3
    You should always use a generator expression; there is no point creating a list just to throw it away. The point of `all` is that it evaluates lazily and short-circuits as soon as it finds an element that does not meet the condition, so a full list of results is never created. – Padraic Cunningham Mar 06 '15 at 20:26
  • @PadraicCunningham I learn primarily by example, would you be able to post an answer that exhibits your point? – ThorSummoner Mar 06 '15 at 20:36
  • 1
    @ThorSummoner Simply remove the square brackets: `all('mysubstring' in item for item in my_list)` – jedwards Mar 06 '15 at 20:36
  • 1
    It is exactly the same; just remove the `[]`, which removes the list creation and makes the evaluation lazy. – Padraic Cunningham Mar 06 '15 at 20:36
2

Some timings show exactly why you should use a generator expression:

In [25]:  l = ["foobar" for _ in range(100000)]

In [27]: l =  ["doh"] + l

In [28]: timeit all("foo" in s for s in l)
1000000 loops, best of 3: 541 ns per loop

In [29]: timeit all(["foo" in s for s in l])
100 loops, best of 3: 6.49 ms per loop
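The gap comes from short-circuiting, and the following sketch tries to make that visible outside IPython. It counts how many items each form actually tests and then times both with the standard `timeit` module. The helper `contains_foo` and the variable names are made up for illustration, and the timings will differ from machine to machine.

```python
import timeit

# Same shape as the data above: one non-matching item, then 100000 matches.
data = ["doh"] + ["foobar" for _ in range(100000)]

checked = []

def contains_foo(s):
    # Record every item that actually gets tested.
    checked.append(s)
    return "foo" in s

# Generator expression: all() stops at the first False.
all(contains_foo(s) for s in data)
lazy_count = len(checked)

# List comprehension: every item is tested before all() even starts.
del checked[:]
all([contains_foo(s) for s in data])
eager_count = len(checked)

print(lazy_count, eager_count)

# Rough timings in seconds for 100 runs of each form.
gen_time = timeit.timeit(lambda: all("foo" in s for s in data), number=100)
list_time = timeit.timeit(lambda: all(["foo" in s for s in data]), number=100)
print(gen_time, list_time)
```

With the non-matching item first, the generator form tests a single element, while the list form tests all 100001 before `all` sees any of them.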

There are some nice examples of how to use generators, and of their advantages, at wiki.python.org/moin/Generators

I think a good rule of thumb is this: if you only plan on using the generated elements once, or the data is too large to fit in memory, use a generator expression. If you want to use the elements again and the size is practical, store them in a list.
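A small sketch of the "use once" part of that rule, reusing the sample paths from the question (the variable names are only illustrative): a generator expression is exhausted after one pass, whereas a list can be iterated again.

```python
my_list = ['~/.tmp/myproject/filea', '~/.tmp/myproject/fileb']

# A generator expression can be consumed only once.
matches_gen = ('/.tmp/myproject/' in item for item in my_list)
first_pass = all(matches_gen)     # consumes the generator
second_pass = list(matches_gen)   # nothing is left to iterate over
print(first_pass, second_pass)

# A list keeps its elements, so it can be reused as often as needed.
matches_list = ['/.tmp/myproject/' in item for item in my_list]
print(all(matches_list), any(matches_list), matches_list.count(True))
```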

Padraic Cunningham
  • 1
    Maybe it would be better if a link where gen exp is explained clearly is provided so that the OP can understand it – Bhargav Rao Mar 06 '15 at 20:41