
Is there a way to customize this

stopWords = set(stopwords.words('english'))

or any other way, so I can use a text file with stop-words from my language in Python's NLTK?

If my text file were my_stop_words.txt, how can I tell NLTK to use this set of words instead of the set for 'english'?

Thanks a lot!

Erika
  • You can set stopwords to be a list of any words you want: just create the list and assign it, e.g. `stopwords = ['foo','bar','baz','non','le']`, or take an existing set and add to it, as in the question I referenced above. – G. Anderson Oct 01 '18 at 19:25

1 Answer


Yes, you can read in your own file of stop words, although it's also worth noting that NLTK ships stop-word lists for multiple languages.

Try something like:

# Read one stop word per line, dropping surrounding whitespace and blank lines
with open("stopwords.txt", "r", encoding="utf-8") as f:
    new_stopwords = [line.strip() for line in f if line.strip()]

new_stopwords_set = set(new_stopwords)
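
For illustration, here is a self-contained sketch of the same idea using only the standard library, plus the filtering step the set is usually built for. The function names `load_stop_words` and `remove_stop_words` are hypothetical, and the sketch assumes the file is UTF-8 with one stop word per line and that matching should be case-insensitive; adjust if your file differs.

```python
from pathlib import Path


def load_stop_words(path):
    """Return the stop words in `path` (one per line) as a lowercase set."""
    text = Path(path).read_text(encoding="utf-8")
    # Strip each line so trailing newlines or spaces don't end up in the set,
    # and skip lines that are empty after stripping.
    return {line.strip().lower() for line in text.splitlines() if line.strip()}


def remove_stop_words(tokens, stop_words):
    """Keep only the tokens whose lowercase form is not a stop word."""
    return [tok for tok in tokens if tok.lower() not in stop_words]
```

With the file from the question, `load_stop_words("my_stop_words.txt")` would give a set that can be used anywhere `set(stopwords.words('english'))` was used before, for example by passing it to `remove_stop_words` together with a list of tokens.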
Sven Harris