I'm doing some topic modeling and am looking to store some of the results of my analysis.
import pandas as pd, numpy as np, scipy
import sklearn.feature_extraction.text as text
from sklearn import decomposition
descs = ["You should not go there", "We may go home later", "Why should we do your chores", "What should we do"]
vectorizer = text.CountVectorizer()
dtm = vectorizer.fit_transform(descs).toarray()
vocab = np.array(vectorizer.get_feature_names_out())  # get_feature_names() on scikit-learn < 1.0
nmf = decomposition.NMF(3, random_state = 1)
topic = nmf.fit_transform(dtm)
topic_words = []
for t in nmf.components_:
    word_idx = np.argsort(t)[::-1][:20]
    topic_words.append(vocab[i] for i in word_idx)
for t in range(len(topic_words)):
    print("Topic {}: {}\n".format(t, " ".join([word for word in topic_words[t]])))
Prints:
Topic 0: do we should your why chores what you there not may later home go
Topic 1: should you there not go what do we your why may later home chores
Topic 2: we may later home go what do should your you why there not chores
I'm trying to write those topics to a file, so I thought storing them in a list might work, like this:
l = []
for t in range(len(topic_words)):
    l.append([word for word in topic_words[t]])
    print("Topic {}: {}\n".format(t, " ".join([word for word in topic_words[t]])))
But l just ends up as an empty array. How can I store these words in a list?
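A stripped-down sketch without scikit-learn appears to reproduce the same behaviour, which makes me suspect the generator expression passed to append: a generator can only be iterated once, so the first print loop would consume it and leave nothing for the second loop. The names below (index_rows, lazy_topics, eager_topics) are hypothetical stand-ins for the argsort indices and topic_words, not part of my real code.

```python
# Hypothetical stand-ins for vocab and the per-topic argsort indices.
vocab = ["chores", "do", "go", "home"]
index_rows = [[1, 0], [2, 3]]

# Appending a generator expression, as in the code above.
lazy_topics = []
for row in index_rows:
    lazy_topics.append(vocab[i] for i in row)

# First pass: joining the words consumes each generator.
first_pass = [" ".join(words) for words in lazy_topics]
# Second pass: the generators are exhausted, so every inner list is empty.
second_pass = [[word for word in words] for words in lazy_topics]

# Appending a list comprehension instead, so the words can be read repeatedly.
eager_topics = []
for row in index_rows:
    eager_topics.append([vocab[i] for i in row])

stored = [list(words) for words in eager_topics]

print(first_pass)
print(second_pass)
print(stored)
```

If that reading is right, appending [vocab[i] for i in word_idx] rather than the bare generator should keep the words available for both the print loop and the later list-building loop.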