How to delete item in nested list if it contains keyword?

Question

I have 2 lists. The list named "keyword" is a list I manually created, and the nested list named "mylist" is an output of a function that I have in my script. This is what they look like:

keyword = ["Physics", "Spanish", ...]

mylist = [("Jack","Math and Physics"), 
          ("Bob","English"), 
          ("Emily","Physics"), 
          ("Mark","Gym and Spanish"),
          ("Brian", "Math and Gym"),
          ...]

What I am trying to do is delete each item in the nested list if that item (the tuple in parentheses) contains any of the keywords in the "keyword" list.

For example, in this case, any items in "mylist" that contain the words "Physics" or "Spanish" should be deleted from "mylist". Then, when I print "mylist", this should be the output:

[("Bob","English"), ("Brian", "Math and Gym")]

I searched the internet and many different SO posts to learn how to do this (such as this), but when I adapt the code to my nested list (instead of a flat list) and run it, I get the following error:

Traceback (most recent call last):
  File "namelist.py", line 165, in <module>
    asyncio.get_event_loop().run_until_complete(request1())
  File "C:\Users\XXXX\AppData\Local\Programs\Python\Python37\lib\asyncio\base_events.py", line 576, in run_until_complete
    return future.result()
  File "namelist.py", line 154, in request1
    mylist.remove(a)
ValueError: list.remove(x): x not in list

Does anyone know how to fix this error, and could you share your code?

EDIT: By the way, the real "mylist" in my script is much longer than what I wrote here, and I have about 15 keywords. When I run it on a small scale like this, the code works well, but as soon as I have more than 5 keywords, I keep getting this error.

I assume you only care if any of the words in the second item in a tuple are a keyword, but you should probably be explicit about that. — Turksarama, Jul 30 '19 at 01:15

benvc · Accepted Answer · 2019-07-30T02:05:23.847

You could join each of the tuples into a string and then check if any keyword is in the string to filter your list.

newlist = [m for m in mylist if not any(k in ' '.join(m) for k in keyword)]

print(newlist)
# [('Bob', 'English'), ('Brian', 'Math and Gym')]
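
A possible variation, sketched under two assumptions that the question does not confirm: only the second element of each tuple should be checked (as the comment on the question suggests), and a keyword should match a whole word rather than any substring. The sample data is copied from the question; `pattern` is a name introduced here for illustration.

```python
import re

keyword = ["Physics", "Spanish"]
mylist = [("Jack", "Math and Physics"),
          ("Bob", "English"),
          ("Emily", "Physics"),
          ("Mark", "Gym and Spanish"),
          ("Brian", "Math and Gym")]

# One alternation pattern; \b restricts matches to whole words,
# and re.escape protects keywords that contain regex metacharacters.
pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, keyword)) + r")\b")

# Assumption: only the second element (the subjects) is checked, not the name.
newlist = [(name, subjects) for name, subjects in mylist
           if not pattern.search(subjects)]

print(newlist)
# [('Bob', 'English'), ('Brian', 'Math and Gym')]
```

With a plain substring test, a keyword such as "Art" would also match a name like "Martha" once the tuple is joined into one string; the word-boundary pattern avoids that case.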

edited Jul 30 '19 at 02:05

answered Jul 30 '19 at 01:09

benvc

score 1 · Answer 2 · answered Jul 30 '19 at 01:06

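# Rebuild the list instead of calling remove() while iterating over it,
# which skips items and raises ValueError if a tuple is removed twice.
mylist = [tup for tup in mylist
          if not any(key in item for key in keyword for item in tup)]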
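
If the original list object has to be updated in place (an assumption here, for example because other code holds a reference to it), slice assignment is one way to do that without calling remove() during iteration. A minimal sketch using the question's sample data; `alias` is a hypothetical second reference added only to show the effect.

```python
keyword = ["Physics", "Spanish"]
mylist = [("Jack", "Math and Physics"),
          ("Bob", "English"),
          ("Emily", "Physics"),
          ("Mark", "Gym and Spanish"),
          ("Brian", "Math and Gym")]

alias = mylist  # hypothetical second reference to the same list object

# Slice assignment replaces the contents of the existing list object,
# so every reference to it sees the filtered result.
mylist[:] = [tup for tup in mylist
             if not any(key in item for key in keyword for item in tup)]

print(alias)
# [('Bob', 'English'), ('Brian', 'Math and Gym')]
```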

The Highlight Hub

score 1 · Answer 3 · answered Jul 30 '19 at 01:07

You can start by splitting the fields on " and " and looking at the intersection between the keywords and the fields of each person. For instance, you could imagine something like this:

new_list = []

for name,fields in mylist:
    # Convert the string into a set of string for intersection
    field_set = set(fields.split(" and "))
    field_in_keys = field_set.intersection(keyword)

    # Add in the new list if no intersection is found
    if len(field_in_keys) == 0:
        new_list.append((name,fields))

You get:

[('Bob', 'English'), ('Brian', 'Math and Gym')]
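
The same idea can be extended to separators other than " and ". The sketch below assumes, hypothetically, that fields may also be separated by commas or ampersands; the sample data is invented to show that case and is not from the question.

```python
import re

keyword = {"Physics", "Spanish"}  # a set keeps the overlap test cheap

# Hypothetical data with mixed separators
mylist = [("Jack", "Math, Physics"),
          ("Bob", "English"),
          ("Mark", "Gym and Spanish"),
          ("Brian", "Math & Gym")]

new_list = []
for name, fields in mylist:
    # Split on " and ", commas, or ampersands, then trim whitespace
    parts = re.split(r"\s+and\s+|[,&]", fields)
    field_set = {p.strip() for p in parts if p.strip()}

    # Keep the tuple only if no field is a keyword
    if field_set.isdisjoint(keyword):
        new_list.append((name, fields))

print(new_list)
# [('Bob', 'English'), ('Brian', 'Math & Gym')]
```

`set.isdisjoint` returns True when the two collections share no element, which expresses the "no intersection found" condition directly.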

If speed matters on a much larger list, pandas might do the work more efficiently.

Nakor

This works great for my example, however, I think benvc's answer is a little bit more robust since it doesn't specifically require the word "and" to be in the list. Thanks for the answer though, upvoted! – F16Falcon Jul 30 '19 at 01:24
I prefer `if not field_in_keys` instead of `if len(field_in_keys) == 0` here. – RoadRunner Jul 30 '19 at 02:13

score 1 · Answer 4 · answered Jul 30 '19 at 01:29

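# Keep a tuple only if none of the words in its second field is a keyword.
# Building a new list avoids removing from mylist while iterating over it.
mylist = [i for i in mylist
          if not any(w in keyword for w in i[1].split(' '))]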

If you just loop through each word I think that will work as well.
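
One plausible cause of the ValueError in the question (an assumption, since the asker's code is not shown) is that the same tuple gets removed twice inside nested loops. The sketch below reproduces that with invented data in which a keyword appears twice in one field.

```python
# Hypothetical data: the keyword occurs twice in the same field
mylist = [("Emily", "Physics and Physics"), ("Bob", "English")]
keyword = ["Physics"]

error = None
try:
    for x in keyword:
        for i in mylist:
            for w in i[1].split(' '):
                if w == x:
                    # The first match removes the tuple; the second match
                    # tries to remove it again and fails.
                    mylist.remove(i)
except ValueError as exc:
    error = str(exc)

print(error)
```

Removing items while looping over the same list also shifts the remaining elements, so some tuples are never examined; filtering into a new list avoids both problems.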

Senrab
