176

I can do this in Python:

l = ['one', 'two', 'three']
if 'some word' in l:
   ...

This checks whether 'some word' exists in the list. But can I do the reverse?

l = ['one', 'two', 'three']
if l in 'some one long two phrase three':
    ...

I need to check whether any of the words from the list appear in the string. I can do this with a loop, but that takes more lines of code.

James Wierzba
Max Frai

4 Answers

392
if any(word in 'some one long two phrase three' for word in list_):
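As a minimal runnable sketch of the one-liner above (the helper name `contains_any` and the sample data are illustrative assumptions, not part of the original answer):

```python
def contains_any(words, text):
    """Return True if at least one word occurs in text as a substring."""
    # any() stops at the first match because the generator is lazy
    return any(word in text for word in words)


list_ = ['one', 'two', 'three']
phrase = 'some one long two phrase three'

found = contains_any(list_, phrase)               # at least one word occurs
missing = contains_any(['four', 'five'], phrase)  # none of these occur
```

Because this is a substring test, a word such as 'me' would also count as found, since it occurs inside 'some'.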
kennytm
  • 24
    @Ockonal: and if you want to check that **all** words from that list are inside the string, just replace `any()` above with `all()` – Nas Banov Jul 17 '10 at 23:23
  • 30
    Note that if 'me' is in `list_`, it will count as a match, since 'me' is in 'some'. If you want to match whole words only, you'll need to change to `any(word in 'some one long two phrase three'.split() for word in list_)`, as I did when creating the sets in my answer. – PaulMcG Jul 18 '10 at 05:53
  • @NasBanov what should I do if I want to count the number of matches between the list and the string? – Ved Gupta Aug 11 '15 at 05:48
  • 1
    @VedGupta, use `len` instead of `any`? :) https://docs.python.org/3/library/functions.html#len – Nas Banov Aug 12 '15 at 03:53
  • 1
    This only worked for me when I used a list comprehension: `any([word in 'some one long two phrase three' for word in list_])` which is what I would expect - not sure how it worked without that. – Alex Petralia Sep 11 '15 at 15:29
  • @kennytm how could I run this if "some one long two phrase three" were an array of strings instead? That is, an array of filter words checked against an array containing lots of strings. Never mind, putting it all in a loop did the trick. I thought I needed more. Thanks for this awesome answer! – Dariusz Jan 07 '17 at 20:25
  • How can I find which particular values from the list matched the string? – nlogn Jan 14 '17 at 11:16
  • 5
    @nlogn: `words = [word for word in list_ if word in 'long phrase']` (or use `filter`). – kennytm Jan 14 '17 at 12:07
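The follow-up questions in these comments (all words, whole words, counting, and listing matches) can be sketched together. The variable names and sample data are illustrative. Note that `len` cannot be applied directly to a generator expression, so this sketch counts with `sum` instead.

```python
list_ = ['one', 'two', 'four']
phrase = 'some one long two phrase three'

# Every word must occur somewhere in the phrase (substring test)
all_present = all(word in phrase for word in list_)

# Whole-word test: compare against the space-separated tokens instead
tokens = phrase.split()
any_whole_word = any(word in tokens for word in list_)

# Count how many list entries occur in the phrase; each True counts as 1
match_count = sum(word in phrase for word in list_)

# Collect the entries that matched
matched = [word for word in list_ if word in phrase]
```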
33

Here are a couple of alternative ways of doing it that may be faster or more suitable than KennyTM's answer, depending on the context.

1) use a regular expression:

import re
# re.escape keeps any regex metacharacters in the words literal
words_re = re.compile("|".join(map(re.escape, list_of_words)))

if words_re.search('some one long two phrase three'):
    ...  # do logic you want to perform

2) You could use sets if you want to match whole words, e.g. you do not want to find the word "the" in the phrase "them theorems are theoretical":

word_set = set(list_of_words)
phrase_set = set('some one long two phrase three'.split())
if word_set.intersection(phrase_set):
    ...  # do stuff

Of course you can also do whole word matches with regex using the "\b" token.

The performance of these and of Kenny's solution will depend on several factors, such as how long the word list and phrase string are, and how often they change. If performance is not an issue, go for the simplest, which is probably Kenny's.
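Since the relative speed depends on the data, one way to decide is to time the candidates on representative input. The following is a rough sketch using the standard `timeit` module; the word list, phrase, and function names are made-up placeholders, and the numbers it prints will vary by machine and input.

```python
import re
import timeit

list_of_words = ['one', 'two', 'three']
phrase = 'some one long two phrase three'

# Precompute whatever can be reused across many checks
words_re = re.compile("|".join(map(re.escape, list_of_words)))
word_set = set(list_of_words)


def with_any():
    return any(word in phrase for word in list_of_words)


def with_regex():
    return words_re.search(phrase) is not None


def with_sets():
    # Whole-word semantics, unlike the two substring checks above
    return bool(word_set.intersection(phrase.split()))


timings = {
    func.__name__: timeit.timeit(func, number=10_000)
    for func in (with_any, with_regex, with_sets)
}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f}s for 10,000 calls")
```

For a meaningful comparison, replace the placeholder data with a sample of the real word lists and strings.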

Dave Kirby
  • Thanks for the answer. And please add a quote after `list_of_words` on the second line. – Max Frai Jul 18 '10 at 05:39
  • Just tried the last one in Python 3.3; I had to use `if word_set.intersection(phrase_set):` – user3271518 Dec 03 '15 at 19:29
  • @dave which is the more efficient way if my list of words is going to be 30-50 words long, my strings will be up to 300 words, and I have to do upwards of 100k such comparisons? – ketanbhatt Dec 23 '15 at 16:15
  • 1
    @ketanbhatt It will depend on a number of factors. Do you need to match whole words? Will a large proportion of strings have no matches? Will some words in the list appear more often than others? You need to time each of the alternatives on a representative subset of the strings to see which one performs best. – Dave Kirby Dec 24 '15 at 13:45
  • 2
    for the whole word matches with the "\b" token: `words_re = re.compile(r"\b" + r"\b|".join(list_of_words)+r"\b")` – datapug Feb 03 '19 at 16:48
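A hedged sketch of whole-word matching with `\b`: because `|` has the lowest precedence in a regular expression, the alternatives are wrapped in a non-capturing group here so that both boundaries apply to every word, not only to the first and last. `re.escape` is added as a precaution against metacharacters. The sample words are illustrative.

```python
import re

list_of_words = ['the', 'one', 'two']

# Group the alternatives so that \b applies to each of them
pattern = r"\b(?:" + "|".join(map(re.escape, list_of_words)) + r")\b"
words_re = re.compile(pattern)

# 'the' only appears embedded in longer words here, so nothing matches
embedded_only = words_re.search('them theorems are theoretical')

# 'one' stands alone here, so the search succeeds
whole_word = words_re.search('some one long two phrase three')
```

Note that `\b` only behaves as a word boundary for entries that begin and end with word characters (letters, digits, underscore).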
26

If your list of words is of substantial length, and you need to do this test many times, it may be worth converting the list to a set and using set intersection to test (with the added benefit that you will get the actual words that are in both lists):

>>> long_word_list = 'some one long two phrase three about above along after against'
>>> long_word_set = set(long_word_list.split())
>>> set('word along river'.split()) & long_word_set
set(['along'])
PaulMcG
  • 1
    That won't be the same as it just checks if space separated words match the words you are looking for. You won't be able to find `foo` within `foobar` for example. – poke Jul 17 '10 at 13:41
  • 1
    @poke - True. It's not clear to me whether the OP wants such partial/embedded word matches or not. As often as not, people write code testing for a word within a larger string of words, assuming they are doing word matching but in fact are doing string matching. This method checks whole words against a set of whole words, without looking for any embedded matches (such as matching 'out' in 'about'). – PaulMcG Jul 18 '10 at 05:50
  • Yeah sure, I just thought it might be important to mention that your solution (which is a good one btw.) does not behave the same as the `in` operator. – poke Jul 19 '10 at 16:07
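The difference discussed in these comments can be seen side by side. This sketch is illustrative only; the sample words and phrase are assumptions, and splitting on whitespace does not strip punctuation.

```python
words = ['foo', 'out']
phrase = 'foobar is about right'

# Substring semantics: the `in` operator finds embedded matches
substring_hits = [w for w in words if w in phrase]

# Whole-word semantics: intersect with the whitespace-separated tokens
whole_word_hits = set(words) & set(phrase.split())
```

Here the substring test finds both 'foo' (in 'foobar') and 'out' (in 'about'), while the set intersection is empty because neither appears as a standalone word.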
4

The easiest and simplest way to solve this problem is to use re:

import re

search_list = ['one', 'two', 'there']
long_string = 'some one long two phrase three'
# re.IGNORECASE makes the search case-insensitive
if re.compile('|'.join(search_list), re.IGNORECASE).search(long_string):
    pass  # Do something if a word is present
else:
    pass  # Do something else if no word is present
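If you also need to know which words matched, one variation on the same idea is to compile the pattern once and call `findall`. This is a sketch, not part of the original answer; `re.escape` is added as a precaution, and the sample string is made up.

```python
import re

search_list = ['one', 'two', 'three']
long_string = 'Some ONE long Two phrase'

# Compile once if the same word list is checked against many strings
pattern = re.compile('|'.join(map(re.escape, search_list)), re.IGNORECASE)

# findall returns the matched text as it appears in long_string, in order
found = pattern.findall(long_string)
```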
Huge
Anurag Misra