I am trying to write a list of NLTK stop words to a file, so I wrote this script:
import nltk
from nltk.corpus import stopwords
from string import punctuation
file_name = 'OUTPUT.CSV'
file = open(file_name, 'w+')
_stopwords = set(stopwords.words('english')+list(punctuation))
i = 0
file.write(f'\n\nSTOP WORDS:+++\n\n')
for w in _stopwords:
    i = i + 1
    out1 = f'{i:3}. {w}\n'
    out2 = f'{w}\n'
    out3 = f'{i:3}. {w}'
    file.write(out2)
    print(out3)
file.close()
The original program used file.write(w), but since I ran into problems, I started experimenting.
First I tried file.write(out1). That works, but the order of the stop words appears to be random.
What's interesting is that if I use file.write(out2), only some of the stop words end up in the file: the count changes from run to run and is always short of 211, and the order again appears to be random. I see the same problem in both Visual Studio 2017 and Jupyter Notebook.
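To narrow this down, here is a minimal sketch (not the original script) that sorts the set before writing and then counts the lines by reading the file back in Python, so the count does not depend on whichever program is used to view the output. Python sets have no guaranteed iteration order, which would explain the shuffled output. The names sample_words, write_words, count_lines and the output path are hypothetical; sample_words stands in for stopwords.words('english') so the sketch runs without the NLTK corpus.

```python
import os
import tempfile
from string import punctuation

# Hypothetical stand-in for stopwords.words('english'),
# so this sketch runs without downloading NLTK data.
sample_words = ['its', 'wouldn', 'shan', 'more', 'have']


def write_words(path, words):
    """Write one entry per line in sorted order; return how many were written."""
    ordered = sorted(set(words))  # sets are unordered; sorting gives a repeatable order
    # 'with' closes (and flushes) the file even if an error occurs
    with open(path, 'w', encoding='utf-8') as fh:
        for w in ordered:
            fh.write(f'{w}\n')
    return len(ordered)


def count_lines(path):
    """Count the lines actually present in the file."""
    with open(path, encoding='utf-8') as fh:
        return sum(1 for _ in fh)


# Hypothetical output location, chosen only for this sketch.
out_path = os.path.join(tempfile.gettempdir(), 'output_check.csv')
written = write_words(out_path, sample_words + list(punctuation))
print(written, count_lines(out_path))
```

If the two numbers match, every entry reached the file, and any shortfall seen later would come from how the file is being displayed or counted rather than from the write itself.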
For example, the last run wrote 175 words ending with:
its
wouldn
shan
Using file.write(out1), I get all 211 words and the column ends like this:
209. more
210. have
211. ,
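One possibility worth checking, and it is only an assumption here: the output file has a .CSV extension, and string.punctuation contributes a bare " and a bare , to the set. A CSV-aware reader treats a line starting with " as the start of a quoted field and can swallow the following lines into it, which would make the visible row count fall short. The sketch below uses only the standard csv module on a small illustrative subset (the entries list is made up for the example) to compare plain writes with csv.writer, which quotes such fields.

```python
import csv
import io

# Illustrative subset only: two words plus the two troublesome punctuation marks.
entries = ['its', '"', ',', 'shan']

# Plain writes, as with file.write(out2): one entry per line, no CSV quoting.
raw_text = ''.join(f'{w}\n' for w in entries)
rows_raw = list(csv.reader(io.StringIO(raw_text)))

# csv.writer quotes fields that contain the delimiter or the quote character.
buf = io.StringIO()
writer = csv.writer(buf)
for w in entries:
    writer.writerow([w])
rows_csv = list(csv.reader(io.StringIO(buf.getvalue())))

print('entries written:   ', len(entries))
print('rows, plain write: ', len(rows_raw))
print('rows, csv.writer:  ', len(rows_csv))
```

With out1 each line starts with a number rather than a quote character, which may be why that variant shows every row.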
Has anyone run into a similar problem? Any idea what may be going on?
I'm new to Python/NLTK so I decided to ask.