import logging
logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s', level=logging.INFO)
from gensim import corpora, models, similarities
from nltk.corpus import stopwords
import codecs
documents = []
with codecs.open("Master_File_for_Docs.txt", encoding = 'utf-8', mode= "r") as fid:
for line in fid:
documents.append(line)
stoplist = []
x = stopwords.words('english')
for word in x:
stoplist.append(word)
#Removes Stopwords
texts = [[word for word in document.lower().split() if word not in stoplist]
for document in documents]
dictionary = corpora.Dictionary(texts)
corpus = [dictionary.doc2bow(text) for text in texts]
lda = models.LdaModel(corpus, id2word=dictionary, num_topics=100)
lda.print_topics(20)
#corpus_lda = lda[corpus]
#for doc in corpus_lda:
# print(doc)
I am running Gensim for topic modeling and trying to get the above code working. I know that this code works because my friend ran it from a mac computer and it worked successfully but when I run it from a windows computer the code gives me a
MemoryError
Also the logging that I set on the second line also doesn't appear on my windows computer. Is there something in Windows that I need to fix in order for gensim to work?