Hi everybody, I want to remove stop words from a text file without using NLTK. I have a text file that contains a list of stop words, and I want to use that list for the filtering. Thank you.
- Try to [read](https://www.w3schools.com/python/python_file_open.asp) the file, [split](https://www.w3schools.com/python/ref_string_split.asp) it into words, [filter](https://stackoverflow.com/questions/12934190/is-there-a-short-contains-function-for-lists) them against the stop list, and [write](https://www.w3schools.com/python/python_file_write.asp) the result to another file. – Yevhen Bondar Dec 28 '21 at 19:31
- Welcome to [Stack Overflow](https://stackoverflow.com/ "Stack Overflow"). Please be aware this is not a code-writing or tutoring service. We can help solve specific, technical problems, not open-ended requests for code or advice. Please edit your question to show what you have tried so far, and what specific problem you need help with. See the [How To Ask a Good Question](https://stackoverflow.com/help/how-to-ask "How To Ask a Good Question") page for details on how to best help us help you. – itprorh66 Dec 28 '21 at 19:32
1 Answer
Although the exact requirements are hard to understand, I would do something like the following:

    # Load the stop words, one per line (e.g. "and" and "or"), into a set for fast lookups
    with open("stopwords.txt") as f:
        stopwords = {line.strip().lower() for line in f if line.strip()}

    text = "Foo and bar or foo"
    tokens = text.split()  # Split into a list of words

    # Build a new list rather than calling tokens.remove() inside the loop:
    # removing items from a list while iterating over it skips the next element
    tokens = [word for word in tokens if word.lower() not in stopwords]

    clean_text = " ".join(tokens)  # Join the remaining words into a string
    print(clean_text)  # Outputs: "Foo bar foo"
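Since the question mentions a text file rather than a string, here is a rough sketch of the same idea applied file to file, following the read, split, filter, write steps suggested in the comments. The function names `remove_stopwords` and `clean_file`, the file names `input.txt` and `output.txt`, and the choice to ignore surrounding punctuation when matching are my own assumptions, not something the question specifies:

```python
import string


def remove_stopwords(text, stopwords):
    """Return text with every word that appears in stopwords dropped."""
    kept = []
    for word in text.split():
        # Assumption: match case-insensitively and ignore surrounding
        # punctuation, so "And," is treated like the stop word "and".
        key = word.strip(string.punctuation).lower()
        if key not in stopwords:
            kept.append(word)
    return " ".join(kept)


def clean_file(text_path, stopwords_path, output_path):
    """Read text_path, drop the stop words listed in stopwords_path
    (one per line), and write the result to output_path."""
    with open(stopwords_path, encoding="utf-8") as f:
        stopwords = {line.strip().lower() for line in f if line.strip()}

    with open(text_path, encoding="utf-8") as src, \
            open(output_path, "w", encoding="utf-8") as dst:
        for line in src:
            # Process line by line so the original line breaks are kept
            dst.write(remove_stopwords(line, stopwords) + "\n")
```

It could then be called as `clean_file("input.txt", "stopwords.txt", "output.txt")`. Note that `split()` collapses runs of whitespace, so the spacing inside each line is normalized to single spaces; if exact spacing matters, a different tokenization would be needed.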

Sam