All,
This is a re-post of what I wrote in response to this thread. I am getting some strange results when trying to print LSI topics in gensim. Here is my code:
try:
    from gensim import corpora, models
except ImportError as err:
    print err


class LSI:
    def topics(self, corpus):
        tfidf = models.TfidfModel(corpus)
        corpus_tfidf = tfidf[corpus]
        dictionary = corpora.Dictionary(corpus)
        lsi = models.LsiModel(corpus_tfidf, id2word=dictionary, num_topics=5)
        print lsi.show_topics()


if __name__ == '__main__':
    data = '../data/data.txt'
    corpus = corpora.textcorpus.TextCorpus(data)
    LSI().topics(corpus)
This prints the following to the console:
-0.804*"(5, 1)" + -0.246*"(856, 1)" + -0.227*"(145, 1)" + ......
I would like to be able to print out the topics like @2er0 did over here, but I am getting results like the ones above. Note that the second item in each printed term, where I would expect a word, is a tuple, and I have no idea where it came from. data.txt is a plain text file with several paragraphs in it. That is all.
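My only guess so far, which I have not confirmed: a gensim bag-of-words document is a list of (token_id, count) pairs, so if a vocabulary were built from documents that are already in that form, each pair would itself be treated as a token. The sketch below only illustrates that idea in plain Python. The names build_vocab, token_docs and bow_docs are hypothetical, none of this is gensim API, and it does not claim to reproduce what gensim does internally.

```python
# Illustration only: plain Python, no gensim calls.
# build_vocab, token_docs and bow_docs are hypothetical names.

def build_vocab(documents):
    """Map the string form of every distinct token to an integer id."""
    vocab = {}
    for doc in documents:
        for token in doc:
            # setdefault assigns the next free id the first time a token is seen
            vocab.setdefault(str(token), len(vocab))
    return vocab


# Documents given as lists of word tokens.
token_docs = [["human", "interface"], ["graph", "human"]]

# Documents already converted to (token_id, count) pairs.
bow_docs = [[(0, 1), (1, 1)], [(0, 1), (2, 1)]]

print(sorted(build_vocab(token_docs)))  # ['graph', 'human', 'interface']
print(sorted(build_vocab(bow_docs)))    # ['(0, 1)', '(1, 1)', '(2, 1)']
```

The second vocabulary has entries of the same shape as the "(5, 1)" terms in my output, which is why I suspect something along these lines.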
Any thoughts on this would be fantastic! Adam