I would like to remove the stopwords from each sentence in a list while keeping the result in the same format (i.e. a list of lists, one inner list per sentence).
Here is the code I have tried so far:
sent1 = 'I have a sentence which is a list'
sent2 = 'I have a sentence which is another list'
from nltk.corpus import stopwords
stop_words = stopwords.words('english')
lst = [sent1, sent2]
sent_lower = [t.lower() for t in lst]
filtered_words=[]
for i in sent_lower:
    i_split = i.split()
    lst = []
    for j in i_split:
        if j not in stop_words:
            lst.append(j)
            " ".join(lst)
            filtered_words.append(lst)
Current Output of filtered_words:
filtered_words
[['sentence', 'list'],
 ['sentence', 'list'],
 ['sentence', 'another', 'list'],
 ['sentence', 'another', 'list'],
 ['sentence', 'another', 'list']]
Desired Output of filtered_words:
filtered_words
[['sentence', 'list'],
 ['sentence', 'another', 'list']]
I am getting duplicate lists in the output. What am I doing wrong in the loop? Also, is there a better way to do this than writing so many for loops?
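
One possible way to approach this, offered as a minimal sketch rather than a definitive answer: the current output is consistent with `filtered_words.append(lst)` running inside the inner loop, once per kept word, so the same list object is appended two times for the first sentence and three times for the second. Appending once per sentence, after the inner loop finishes, would avoid that. A nested list comprehension does the same thing in one expression. The function name `remove_stopwords` and the small `demo_stop_words` set are assumptions made up for this illustration; the set is only a stand-in so the sketch runs without the NLTK corpus, and `stopwords.words('english')` from the question could be passed in its place.

```python
def remove_stopwords(sentences, stop_words):
    """Return one list of non-stopword tokens per input sentence."""
    # Converting to a set makes each membership test faster than scanning a list.
    stop_set = set(stop_words)
    return [
        [word for word in sentence.lower().split() if word not in stop_set]
        for sentence in sentences
    ]


# Hypothetical stand-in for stopwords.words('english'); it covers only the
# stopwords that occur in the two example sentences.
demo_stop_words = {'i', 'have', 'a', 'which', 'is'}

sentences = ['I have a sentence which is a list',
             'I have a sentence which is another list']

filtered_words = remove_stopwords(sentences, demo_stop_words)
print(filtered_words)
# [['sentence', 'list'], ['sentence', 'another', 'list']]
```

The outer comprehension produces exactly one inner list per sentence, so the list-of-lists shape is preserved and no list can be appended more than once.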