1

I have a large list that looks something like this:

entries = ["['stuff']...other stuff", "['stuff']...stuff", "['stuff']...more stuff", ...]

I want to remove all elements of the list that don't contain the words "other" or "things".

I tried this but it isn't removing all of the elements I need it to (only some near the end):

for e in entries:
    if 'other' or 'things' not in e:
        entries.remove(e)
print entries

What am I doing wrong?

curious_cosmo
  • Also, note, [this](https://stackoverflow.com/questions/1157106/remove-all-occurrences-of-a-value-from-a-list) will be your next bug... – juanpa.arrivillaga Aug 28 '17 at 23:34
  • Who reopened this? This is **clearly a duplicate** of [this](https://stackoverflow.com/questions/15112125/how-do-i-test-one-variable-against-multiple-values) – juanpa.arrivillaga Aug 28 '17 at 23:37

3 Answers

2

You shouldn't remove items from a list while iterating over it. Also, your condition doesn't do what you mean: it is parsed as `'other' or ('things' not in e)`, so it tests 'other' for truthiness (which is always true) and only 'things' for containment. To fix it, use `and` with two separate `not in` checks.
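
A quick sketch of how Python evaluates the original condition compared with the intended one. The sample string is made up for illustration:

```python
e = "['stuff']...other stuff"   # hypothetical sample entry

# `not in` binds tighter than `or`, so the original condition is parsed as
#     'other' or ('things' not in e)
# and a non-empty string literal is always truthy.
original = 'other' or 'things' not in e
print(original)   # prints: other  (the first truthy operand, so the `if` always fires)

# What was intended: the entry contains neither word.
intended = 'other' not in e and 'things' not in e
print(intended)   # prints: False  (this entry does contain 'other')
```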

If the list is not very big, you could just use a list comprehension to rebuild it:

entries = [e for e in entries if "other" in e or "things" in e]

Otherwise, loop from the end of the list to the beginning and delete items by index:

for i in range(len(entries)-1, -1, -1):
    # delete entries that contain neither word
    if "other" not in entries[i] and "things" not in entries[i]:
        del entries[i]
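
A small sketch of why removing while iterating forward skips elements, while a backward index loop does not. The helper names and the sample list are made up for illustration; the condition is the "contains neither word" test from the question:

```python
def remove_forward(items):
    """Buggy: removing during forward iteration shifts later items left,
    so the element right after each removed one is never examined."""
    items = list(items)          # work on a copy of the input
    for e in items:
        if 'other' not in e and 'things' not in e:
            items.remove(e)
    return items

def remove_backward(items):
    """Deleting by index from the end never shifts unvisited items."""
    items = list(items)
    for i in range(len(items) - 1, -1, -1):
        if 'other' not in items[i] and 'things' not in items[i]:
            del items[i]
    return items

sample = ["a", "b", "other c", "d", "e"]
print(remove_forward(sample))    # ['b', 'other c', 'e']  ('b' and 'e' were skipped)
print(remove_backward(sample))   # ['other c']
```
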
Eugene Yarmash
  • Well, both are pretty big problems. But the proximal cause of the issue is the conditional. And following close behind will be the modification of the list during iteration. For the record, I didn't downvote... – juanpa.arrivillaga Aug 28 '17 at 23:33
0

As others have already pointed out, in your version there are three main problems:

for e in entries:
    if 'other' or 'things' not in e: # `or` returns its first truthy operand, and 'other' is always truthy. Also, you need `and`, not `or`.
        entries.remove(e) # mutating the list you are iterating over is bad
print entries

Here is your version, revised to fix the above problems:

for e in entries[:]: # entries[:] is a copy of entries, which avoids mutating the list being iterated over
    if 'other' not in e and 'things' not in e: # remove entries that contain neither 'other' nor 'things'
        print(e) # show what is being removed
        entries.remove(e)
print(entries)

And here are some alternative ways to do this:

import re

words = ['this doesnt contain chars you want so gone',
         'this contains other so will be included',
         'this is included bc stuff']

answer = list(filter(lambda x: re.search('other|stuff', x), words))
other_way = [sentence for sentence in words if re.search('other|stuff', sentence)]

print(answer)
print(other_way)
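
A hedged variant of the regex approach that builds the pattern from a list of keywords, so a word containing regex metacharacters would still be matched literally. The names `keywords`, `pattern`, `sample`, and `kept` are made up for illustration, and the keywords are the two from the question rather than the 'other|stuff' used above:

```python
import re

keywords = ["other", "things"]   # the words from the question
# re.escape makes each keyword match literally; "|" joins them as alternatives.
pattern = re.compile("|".join(re.escape(k) for k in keywords))

sample = ["['stuff']...other stuff",
          "['stuff']...stuff",
          "['stuff']...more things"]   # hypothetical sample data

kept = [s for s in sample if pattern.search(s)]
print(kept)   # ["['stuff']...other stuff", "['stuff']...more things"]
```

This is still substring matching, so "another" would also count as containing "other"; wrapping each alternative in `\b` word boundaries is one way to require whole words.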
Solaxun
  • I didn't downvote, but this answer is of poor quality. It simply provides an alternative method, with *no explanation* of why the original method was wrong, or how your alternatives actually work, which if the question is about basic conditionals, then this won't be very helpful. – juanpa.arrivillaga Aug 28 '17 at 23:36
0

You can use a list comprehension with any(...) to check each entry for the substrings:

>>> [entry for entry in entries if any(something in entry for something in ["other", "things"])]

This returns a new list of the entries containing either "other" or "things".
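
If the original list object has to be updated in place (for example because other variables refer to it), one option is slice assignment. This is a minimal sketch with made-up sample data; `keywords` and `alias` are illustrative names:

```python
keywords = ("other", "things")
entries = ["['stuff']...other stuff",
           "['stuff']...stuff",
           "['stuff']...more things"]   # hypothetical sample data
alias = entries                          # a second reference to the same list

# Slice assignment replaces the contents without creating a new list object.
entries[:] = [e for e in entries if any(k in e for k in keywords)]

print(entries)            # ["['stuff']...other stuff", "['stuff']...more things"]
print(alias is entries)   # True
```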

Moinuddin Quadri