1

I have two lists, datalist and filterlist, and I want to use a list comprehension to find every item in datalist whose string contains any word from filterlist:

>>> datalist=['mytest123orange','dark angle','double69','spartan','broken image 999','2 cup of tea'] 
>>> filterlist=['test','2','x','123','orange']
>>> print [i for i in datalist if len([ j for j in filterlist if j in i])>0 ]
['mytest123orange', '2 cup of tea']

It works as I want. The problem is that to evaluate len([ j for j in filterlist if j in i ])>0, Python has to loop over every item in filterlist. Even if the first item in filterlist matches, the loop still runs to the end. For example, when checking 'mytest123orange', a match on 'test' is already enough, so I want to 'break' out of the loop at that point instead of also testing '2', 'x', '123' and 'orange'.

My questions:

  1. How can I break inside that loop?
  2. Is there any other better method?
andio
  • 2
    You can't `break` inside a list comprehension, but what about `any(j in i for j in filterlist)`? That will short-circuit when it finds a match. – jonrsharpe Aug 20 '17 at 08:54
  • @jonrsharpe it will still loop over the whole filterlist, even if a match has already been found. – andio Aug 20 '17 at 09:04
  • @andio no it won't... that's the purpose of `any` – Jon Clements Aug 20 '17 at 09:04
  • 1
    @andio no, it won't, see the equivalent Python in the docs: https://docs.python.org/3/library/functions.html#any – jonrsharpe Aug 20 '17 at 09:05
  • @jonrsharpe Probably a good idea to recommend OP [this](https://stackoverflow.com/a/11016430/4909087). – cs95 Aug 20 '17 at 09:05
  • @cᴏʟᴅsᴘᴇᴇᴅ overkill for this case... a simple `any(re.search('orange|test|123|2|x', el) for el in datalist)` will do (and it's fairly clear that `123` is redundant here...) – Jon Clements Aug 20 '17 at 09:10
  • 2
    @cᴏʟᴅsᴘᴇᴇᴅ rather than telling me it's probably a good idea to recommend something, why don't you just... recommend it? – jonrsharpe Aug 20 '17 at 09:16
  • @jonrsharpe Thanks a lot , i got it now. – andio Aug 20 '17 at 09:58
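
To make the point from the comments concrete, here is a minimal sketch of the explicit loop that `any` stands in for. It follows the shape of the pure-Python equivalent shown in the linked docs; the helper name `contains_any` is made up for illustration and is not part of the original code.

```python
def contains_any(item, words):
    """Return True as soon as one of the words is found in item."""
    for word in words:
        if word in item:
            return True  # leaves the loop immediately, like a break
    return False  # only reached when no word matched


datalist = ['mytest123orange', 'dark angle', 'double69', 'spartan',
            'broken image 999', '2 cup of tea']
filterlist = ['test', '2', 'x', '123', 'orange']

result = [item for item in datalist if contains_any(item, filterlist)]
print(result)  # ['mytest123orange', '2 cup of tea']
```

A `return` inside a helper function is the usual way to get `break`-like behaviour when the filtering itself stays in a list comprehension.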

2 Answers

12

Use any() with a generator expression:

filterList=['test','2','x','123','orange']
print ([i for i in datalist if any(j in i for j in filterList)])

any() stops iterating as soon as it finds the first match, so the remaining words in filterList are not checked for that item.
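
One way to check this claim is to count how many substring tests actually run. The sketch below assumes the lists from the question; the `contains` helper and the `checked` log are hypothetical names added only for the measurement.

```python
datalist = ['mytest123orange', 'dark angle', 'double69', 'spartan',
            'broken image 999', '2 cup of tea']
filterlist = ['test', '2', 'x', '123', 'orange']

checked = []  # every (word, item) comparison that was really performed


def contains(word, item):
    checked.append((word, item))
    return word in item


matches = [item for item in datalist
           if any(contains(word, item) for word in filterlist)]

print(matches)       # ['mytest123orange', '2 cup of tea']
print(len(checked))  # 23, not the 30 a full scan of every pair would need
```

'mytest123orange' costs one comparison and '2 cup of tea' costs two, while the four items without a match each cost all five.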

Uri Goren
-1

If Uri's answer were correct, there would be a maximum of one element in the result list:

print [i for i in datalist if any(j for j in filterList if j in i) ]

['mytest123orange', '2 cup of tea']

If no matches were found, the result list would be empty.

The conclusion is that any() does not cause a 'short-circuit', rather the entire generator is exhausted prior to any() being applied.

Ru Chern Chong
  • Besides not being an answer, the conclusion is wrong. The generator is exhausted **while** `any` is being applied and only if no matches are found. – Stop harming Monica Jun 20 '19 at 17:23
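
The comment above can be verified with a small experiment: feed `any` from an explicit iterator and look at what is left in it afterwards. This is only an illustrative sketch; the variable names are not from the original posts.

```python
# A match on the first word: any() returns at once and leaves the rest unread.
words = iter(['test', '2', 'x', '123', 'orange'])
found = any(word in 'mytest123orange' for word in words)
remaining = list(words)
print(found, remaining)  # True ['2', 'x', '123', 'orange']

# No match at all: only then is the iterator consumed to the end.
words = iter(['test', '2', 'x', '123', 'orange'])
found_none = any(word in 'spartan' for word in words)
remaining_none = list(words)
print(found_none, remaining_none)  # False []
```

The result list in the answer above has two elements because `any` is called once per item of datalist, not once for the whole list.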