5

I have a list of strings that I would like to search for a word combination, then delete a string from the list if the combination is not there. Is there a Python list comprehension that would work?

word_list = ["Dogs love ice cream", "Cats love balls", "Ice cream", "ice cream is good with pizza", "cats hate ice cream"]

keep_words = ["Dogs", "Cats"] 

Delete_word = ["ice cream"]

Delete the strings that have "ice cream" in them, but if "dogs" or "cats" is also in the string, keep it.

Desired_output = ["Dogs love ice cream", "Cats love balls", "cats hate ice cream"] 

I was trying this code, and also tried combining conditions with and / or, but cannot get the combination right.

output_list = [x for x in word_list if "ice cream" not in x]
orthoeng2

3 Answers

7

Here's a list comprehension solution:

[x for x in word_list if any(kw.lower() in x.lower() for kw in keep_words) 
 or all(dw.lower() not in x.lower() for dw in Delete_word)]
# ['Dogs love ice cream', 'Cats love balls', 'cats hate ice cream']

This also adds flexibility for multiple words in the delete words list.

Explanation

Iterate over the list and keep the string x if either of the following is True:

  • Any of the keep words are in x
  • None of the delete words are in x

I presume from your example that you wanted this to be case-insensitive, so all comparisons are made on lower-cased versions of the strings.

Two helpful functions are any() and all().
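To make the two conditions concrete, here is a minimal sketch that evaluates them for a single string from the question. The variable names has_keep_word and has_no_delete_word are illustrative and not part of the answer above.

```python
x = "cats hate ice cream"
keep_words = ["Dogs", "Cats"]
Delete_word = ["ice cream"]

# any(): True as soon as one keep word is found in x
has_keep_word = any(kw.lower() in x.lower() for kw in keep_words)

# all(): True only if every delete word is absent from x
has_no_delete_word = all(dw.lower() not in x.lower() for dw in Delete_word)

print(has_keep_word)                        # True ("cats" is in x)
print(has_no_delete_word)                   # False ("ice cream" is in x)
print(has_keep_word or has_no_delete_word)  # True, so x is kept

# Edge cases on empty iterables
print(any([]))  # False
print(all([]))  # True
```

Note that because all() of an empty iterable is True, an empty Delete_word list would keep every string.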

pault
5

As an optimized approach, you can store keep_words and delete_words in sets and use itertools.filterfalse() to filter the list:

In [47]: from itertools import filterfalse

In [48]: def key(x):
             # True means "drop x": no keep word among its tokens and a delete phrase occurs in it
             text = x.lower()
             return keep_words.isdisjoint(text.split()) and any(dw in text for dw in delete_words)
   ....: 

In [49]: keep_words = {"dogs", "cats"}

In [51]: delete_words = {"ice cream"}

In [52]: list(filterfalse(key, word_list))
Out[52]: ['Dogs love ice cream', 'Cats love balls', 'cats hate ice cream']
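One thing to keep in mind with a split-based approach: comparing against the tokens from split() matches whole words only, whereas the in operator on strings matches substrings. A minimal sketch of the difference, using a made-up sentence that is not from the question:

```python
sentence = "Hotdogs with ice cream"  # hypothetical example string
keep_words = {"dogs", "cats"}

text = sentence.lower()

# Substring test: "dogs" occurs inside "hotdogs"
substring_match = any(kw in text for kw in keep_words)

# Token test: no whitespace-separated token equals "dogs" or "cats"
token_match = not keep_words.isdisjoint(text.split())

print(substring_match)  # True
print(token_match)      # False
```

Token matching also means a multi-word phrase such as "ice cream" can never equal a single token, and punctuation attached to a word (for example "dogs,") prevents a match unless it is stripped first.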
Jean-François Fabre
Mazdak
1
>>> list(filter(lambda x: not any(i in x for i in Delete_word)
...                       or  any(i in x for i in keep_words), word_list))
['Dogs love ice cream', 'Cats love balls', 'Ice cream']

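As written, the comparison is case-sensitive, which is why 'Ice cream' is kept and 'cats hate ice cream' is dropped. Modify this accordingly (for example, by lower-casing both sides of each comparison) for a case-insensitive implementation.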
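One possible case-insensitive variant of the filter() approach is sketched below. The helper name keep is an illustrative assumption; the lists are the ones from the question.

```python
word_list = ["Dogs love ice cream", "Cats love balls", "Ice cream",
             "ice cream is good with pizza", "cats hate ice cream"]
keep_words = ["Dogs", "Cats"]
Delete_word = ["ice cream"]


def keep(sentence):
    """Return True if the sentence has no delete word or has any keep word."""
    text = sentence.lower()  # lower-case once, then compare lower-cased words
    return (not any(w.lower() in text for w in Delete_word)
            or any(w.lower() in text for w in keep_words))


output_list = list(filter(keep, word_list))
print(output_list)
# ['Dogs love ice cream', 'Cats love balls', 'cats hate ice cream']
```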

AGN Gazer