I want to remove the stop words in a given list from a list of words that I created by splitting a text on spaces, so that I can count the most frequent words. However, not all stop words are removed, and I do not understand why.
I defined a function (split_into_words) that splits text x into words using re.split(" ", x):
    wordsList = split_into_words(x)
    wordsList = [item.replace("\n", " ") for item in wordsList]
    stopwords = open('stopword.txt').read()
    new_list = []
    for w in wordsList:
        if w.lower() not in stopwords:
            new_list.append(w)
    print(new_list)
The resulting list still includes many stop words, and they appear among the 15 most frequent words (of, by, the, and others).
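
A minimal sketch of what may be going on, under two assumptions that are not confirmed here: the text contains newlines, and stopword.txt holds stop words separated by whitespace (for example one per line). Splitting on a single space leaves tokens such as "door\nof" glued together, so they never match a stop word, and `in` applied to the string returned by read() is a substring test rather than a word lookup. The helper name remove_stop_words and the sample data are hypothetical.

```python
from collections import Counter


def remove_stop_words(text, stopword_text):
    """Return the words of `text` that are not stop words."""
    # Assumption: stop words are separated by whitespace (e.g. one per line).
    stop_set = set(stopword_text.lower().split())
    # str.split() with no argument splits on any run of whitespace,
    # so newlines and tabs separate words just like spaces do.
    words = text.split()
    return [w for w in words if w.lower() not in stop_set]


# Hypothetical sample data standing in for the real text and stopword.txt.
sample_text = "The cat sat by the door\nof the house"
sample_stopwords = "the\nof\nby\n"

filtered = remove_stop_words(sample_text, sample_stopwords)
print(filtered)  # ['cat', 'sat', 'door', 'house']

# Count the most frequent remaining words (top 15, as in the question).
print(Counter(filtered).most_common(15))
```

This sketch does not strip punctuation, so a token like "the," would still get through; that would need separate handling.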