How can I solve this list.remove() error?

Question

I have two inputs: Text, a string, and L1, a list of strings to be excluded.

I split 'Text' into a list of words using the following code:

Text=list(Text.split())

Now I have to remove the words present in the L1 list from this 'Text' list. To do so, I used the following code:

for x in Text:
    if x in L1:
        Text.remove(x)
print(Text)

Inputs:

Text = "jack and jill went to the market to buy bread and cheese cheese is jack favorite food"

L1 = ["and","he","the","to","is"]

Desired Output:

['jack', 'jill', 'went', 'market', 'buy', 'bread', 'cheese', 'cheese', 'jack', 'favorite', 'food']

Actual Output:

['jack', 'jill', 'went', 'the', 'market', 'buy', 'bread', 'cheese', 'cheese', 'jack', 'favorite', 'food']

Please tell me why 'the' is still present in 'Text'.

What did I do wrong? What should I do to get my desired result?

Text = [x for x in Text if x not in L1] – Amit Gupta, Apr 15 '19 at 11:06

Sreeram TP · Accepted Answer · 2019-04-15T11:45:38.277

You can simply use a list comprehension like this to get the desired output:

Text = "jack and jill went to the market to buy bread and cheese cheese is jack favorite food"
L1 = ["and", "he", "the", "to", "is"]

Text = Text.split()
removed = [x for x in Text if x not in L1]
print(removed)
# Output: ['jack', 'jill', 'went', 'market', 'buy', 'bread', 'cheese', 'cheese', 'jack', 'favorite', 'food']

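The reason your code is not working as intended is that you are iterating over the list and altering it at the same time. Each call to Text.remove(x) shifts the remaining elements one position to the left while the loop still advances to the next index, so the element right after every removed word is skipped. In your input 'the' comes right after 'to', so it is never checked.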
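
To make the skipping visible, here is a minimal sketch that replays the loop from the question while recording which positions the loop actually reads. The names words, exclude and visited are made up for this illustration and are not part of the original code.

```python
words = "jack and jill went to the market to buy bread and cheese cheese is jack favorite food".split()
exclude = ["and", "he", "the", "to", "is"]

visited = []  # (index, word) pairs the loop actually looks at
for i, x in enumerate(words):
    visited.append((i, x))
    if x in exclude:
        # remove() deletes the first match and shifts later items left by one,
        # but the loop still moves on to the next index on the following step
        words.remove(x)

print(visited)
print(words)
```

In the printed pairs, the loop reads 'to' at index 3 and then 'market' at index 4. 'the' slid into index 3 when 'to' was removed, so it is never examined and stays in the list.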

As @blubberdiblub mentioned in the comments, this code has a time complexity of O(n*m), where n is the number of words in Text and m is the length of L1, because every x not in L1 check scans the list L1. Converting L1 to a set first makes each membership check O(1) on average, which brings the total down to O(n+m).
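
A minimal sketch of the set-based variant, assuming the same inputs as in the question. The names text, exclude, exclude_set and kept are chosen for this illustration only.

```python
text = "jack and jill went to the market to buy bread and cheese cheese is jack favorite food"
exclude = ["and", "he", "the", "to", "is"]

# Build the set once, outside the comprehension, so it is not rebuilt per word
exclude_set = set(exclude)

# Each "not in" check against a set is O(1) on average instead of scanning a list
kept = [w for w in text.split() if w not in exclude_set]
print(kept)
```

For a five-word exclusion list the difference is negligible. It only starts to matter once both the text and the exclusion list grow large.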

Note that for large numbers of items in both `x` and `L1` (speaking of - say - thousands of items or so) it can be better performance-wise to get a `set()` representation of `L1` before repeatedly doing `in` checks on it. That will reduce time complexity from **O( n \* m )** to **O( n + m )**. — blubberdiblub, Apr 15 '19 at 11:13

score 1 · Answer 2 · answered Apr 15 '19 at 11:04

The reason this isn't working is that you're modifying the list while you're iterating over it, which makes the loop skip elements, as you see. One option would be to iterate over a copy of the list, but I think Sreeram TP's answer is the best approach.
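
For completeness, iterating over a copy could look roughly like this. It is only a sketch using the question's inputs, with words and exclude as illustrative names.

```python
words = "jack and jill went to the market to buy bread and cheese cheese is jack favorite food".split()
exclude = ["and", "he", "the", "to", "is"]

# words[:] is a shallow copy, so removing from words does not disturb the iteration
for x in words[:]:
    if x in exclude:
        words.remove(x)  # removes one occurrence per matching word seen in the copy

print(words)
```

Each remove() call scans the list again, so this is slower than building a new list with a comprehension.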

brunns

score 1 · Answer 3 · answered Apr 15 '19 at 11:09

You should not modify a list while you are iterating over it. Here:

for x in Text:
    if x in L1:
        Text.remove(x)
print(Text)

When you remove x from the list, every later element shifts one position to the left, but the for loop still advances to the next index. The element that moved into the removed slot is therefore never visited, so the loop does not behave as you would like. As mentioned in another answer, you can use a list comprehension, or you could collect the items to remove and remove them afterwards:

toRemove = []
for x in Text:
    if x in L1:
        toRemove.append(x)

for x in toRemove:
    Text.remove(x)
print(Text)

But the list comprehension way is much nicer.
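
If the original list object has to be updated in place, as the two-pass loop does, a list comprehension can be combined with slice assignment. This is a sketch under that assumption. The names words, alias and exclude are illustrative.

```python
words = "jack and jill went to the market to buy bread and cheese cheese is jack favorite food".split()
alias = words  # a second reference to the same list object
exclude = {"and", "he", "the", "to", "is"}

# Assigning to words[:] replaces the contents of the existing list
# instead of binding the name to a new list
words[:] = [w for w in words if w not in exclude]

print(alias is words)
print(alias)
```

The comprehension builds the filtered list in full before the assignment happens, so nothing is being removed from the list while it is read.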

score 1 · Answer 4 · edited Apr 15 '19 at 11:20

The reason that your code is not working is that you are iterating over the list and at the same time making changes in the list.

edited Apr 15 '19 at 11:20

Zain Arshad

answered Apr 15 '19 at 11:13

S M Vaidhyanathan

score 0 · Answer 5 · answered Apr 16 '19 at 12:20

Split_text = Text.split()
matched = [x for x in Split_text if x not in L1]
print(matched)

Yash Shukla
