Python: for loop skipping array elements

Question

I am processing data from a JSON file for machine learning. The data are sentences. The sentences are read into an array and tokenized using NLTK perfectly. So in each sentence array, I am left with something like this ['set', 'a', 'timer', 'for', '*int', '*unit_of_time'], which is totally correct. I would like to remove all elements that contain a '*'. This works correctly 90% of the time, but I find that if there are two elements containing a '*' in succession, the second element is left behind. So if I run:

# In the real code this list comes from: words = nltk.word_tokenize(pattern)
words = ['set', 'a', 'timer', 'for', '*int', '*unit_of_time']
for word in words:
    if '*' in word:
        words.remove(word)

I am left with `words = ['set', 'a', 'timer', 'for', '*unit_of_time']`, but I should be left with `words = ['set', 'a', 'timer', 'for']`. The loop successfully removes '*int', but not '*unit_of_time'.

Am I doing this incorrectly? I am using Python 3.7 on Ubuntu 19.10.

If I can provide any additional information, please let me know.

Don't change the length of a list while iterating over it... — jonrsharpe, Apr 06 '20 at 18:45
Use a list comprehension. `words = [word for word in words if '*' not in word]` — Axe319, Apr 06 '20 at 18:50
To expand on this, when you iterate over a list, it looks at the 1st element, then the second, etc. If you remove the second element and your list now has one less element, it will never iterate over the one that got "bumped" down to the element you removed. (3rd, which is now the 2nd) — Axe319, Apr 06 '20 at 18:57
You can read better answers here. https://stackoverflow.com/questions/1207406/how-to-remove-items-from-a-list-while-iterating — Axe319, Apr 06 '20 at 19:00
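
The comments above can be sketched in code. This is only an illustration using the list from the question, not a definitive fix: the function names (`remove_while_iterating`, `remove_with_comprehension`, `remove_over_copy`) are made up for this example, and the tokens are hard-coded rather than produced by `nltk.word_tokenize`. The first function reproduces the skipped element and records which words the loop actually visited; the other two show the list comprehension suggested above and the alternative of looping over a copy.

```python
def remove_while_iterating(tokens):
    """Reproduce the problem: remove from the same list the loop walks."""
    tokens = list(tokens)          # work on a private copy of the input
    visited = []
    for word in tokens:
        visited.append(word)
        if '*' in word:
            tokens.remove(word)    # later elements shift one slot left
    return tokens, visited


def remove_with_comprehension(tokens):
    """Build a new list that keeps only the words without '*'."""
    return [word for word in tokens if '*' not in word]


def remove_over_copy(tokens):
    """Loop over a slice copy so removals do not disturb the iteration."""
    tokens = list(tokens)
    for word in tokens[:]:
        if '*' in word:
            tokens.remove(word)
    return tokens


sample = ['set', 'a', 'timer', 'for', '*int', '*unit_of_time']

leftover, visited = remove_while_iterating(sample)
print(visited)   # ['set', 'a', 'timer', 'for', '*int']
print(leftover)  # ['set', 'a', 'timer', 'for', '*unit_of_time']

print(remove_with_comprehension(sample))  # ['set', 'a', 'timer', 'for']
print(remove_over_copy(sample))           # ['set', 'a', 'timer', 'for']
```

The `visited` list shows that '*unit_of_time' is never examined: once '*int' is removed at index 4, '*unit_of_time' slides into index 4, and the loop moves on to index 5, which no longer exists. The comprehension is usually the simpler choice; if other references to the same list object must see the change, `words[:] = [word for word in words if '*' not in word]` updates it in place.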
