What's next after topic modelling with LDA?

Question

I’m new to topic modelling.

I hope someone experienced can answer my questions. Here is a simplified description of my data:

1. I have a CSV file with 1000 rows and 2 columns, covering a mixture of topics.
2. Each row holds a document and its document ID. A document can span multiple lines and looks something like: "The movie is about Harry Potter. I like to watch."

I want to find the natural clusters (topics) using a topic model, and then manually assign a label to each cluster based on its top terms.

So I split each document into individual tokens, ran LDA, and used the lowest perplexity score to choose the optimal number of topics.
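To make the selection step concrete, here is a minimal sketch of picking the number of topics by lowest perplexity. It is written in Python purely for illustration, since no code was posted. The names `perplexity` and `best_k` and the log-likelihood values are hypothetical placeholders, not results from the real data. It assumes perplexity is defined as the exponential of the negative per-token log-likelihood.

```python
import math

def perplexity(log_likelihood, n_tokens):
    """Perplexity as exp of the negative average log-likelihood per token."""
    return math.exp(-log_likelihood / n_tokens)

def best_k(scores):
    """Return the number of topics whose perplexity is lowest."""
    return min(scores, key=scores.get)

# Made-up held-out log-likelihoods for three candidate topic counts.
# In practice these would come from the fitted LDA models.
held_out_tokens = 5000
log_likelihoods = {5: -36000.0, 10: -34500.0, 15: -34900.0}

scores = {k: perplexity(ll, held_out_tokens) for k, ll in log_likelihoods.items()}
chosen_k = best_k(scores)
print(chosen_k, scores[chosen_k])
```

Lower perplexity on held-out text means the model assigns higher probability to unseen documents, but it does not guarantee that the topics are easy for a human to label.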

After fitting LDA, I plotted the most frequently occurring terms for each topic.

However:

1. I'm not sure whether I should use bigrams or n-grams and, if so, how to build them. I know that some terms must occur together.
2. Do I have to use a network graph to see how the different terms correlate with each other, or how the different topics link together?
3. I'm not sure whether I'm going about this the right way.
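To illustrate what the first two points mean in practice, here is a minimal sketch that builds bigrams from tokens and counts how often terms co-occur in a document. It uses plain Python for illustration only. The toy documents and the helper names `tokenize` and `bigrams` are hypothetical, and a real pipeline would also strip punctuation and stop words.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy documents standing in for the CSV rows (not the real data).
docs = [
    "the movie is about harry potter",
    "harry potter is a wizard",
    "i like to watch the movie",
]

def tokenize(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()

def bigrams(tokens):
    """Return adjacent token pairs joined with an underscore, e.g. 'harry_potter'."""
    return [f"{a}_{b}" for a, b in zip(tokens, tokens[1:])]

# 1. Count bigrams across the corpus; frequent ones are candidates
#    to keep as single terms before fitting the topic model.
bigram_counts = Counter(bg for doc in docs for bg in bigrams(tokenize(doc)))

# 2. Count how often two distinct terms appear in the same document.
#    Each ((term_a, term_b), count) entry can serve as a weighted edge
#    in a term network graph.
cooccurrence = Counter()
for doc in docs:
    terms = sorted(set(tokenize(doc)))
    cooccurrence.update(combinations(terms, 2))

print(bigram_counts.most_common(3))
print(cooccurrence[("harry", "potter")])
```

Frequent bigrams such as `harry_potter` can be added to the vocabulary alongside single tokens, and the co-occurrence counts give an edge list that a graph library could plot.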

I'm voting to close this question as off-topic because it is not about programming — camille, Jun 13 '18 at 12:45
@camille, it is about programming: I would appreciate it if someone could guide me through the R steps for the subsequent analysis — R_abcdefg, Jun 13 '18 at 16:29
Please [see here](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) for how to post an R question that folks can answer. That includes posting data and code that you've already written, and a detailed, specific question you're trying to solve. What you're looking for is a broader tutorial, which SO can't provide — camille, Jun 13 '18 at 16:35

0 Answers