It is known that LDA topic modeling learns two matrices of probabilities from data: a k x V matrix of P(w|z) values and a D x k matrix of P(z|d) values, where k is the number of topics, V is the vocabulary size, and D is the number of training documents.
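
For concreteness, here is a minimal sketch of how these two matrices can be obtained (assuming scikit-learn's LatentDirichletAllocation and a toy bag-of-words corpus; the variable names are my own):

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

docs = ["apples and oranges", "oranges and bananas",
        "dogs chase cats", "cats chase mice"]
X = CountVectorizer().fit_transform(docs)   # D x V document-term counts

k = 2
lda = LatentDirichletAllocation(n_components=k, random_state=0).fit(X)

# k x V matrix of P(w|z): normalize each topic's word weights to sum to 1
p_w_given_z = lda.components_ / lda.components_.sum(axis=1, keepdims=True)

# D x k matrix of P(z|d): transform() returns normalized topic proportions
p_z_given_d = lda.transform(X)

print(p_w_given_z.shape)  # (k, V)
print(p_z_given_d.shape)  # (D, k)
```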

After reading an earlier question, I learned that the methods mentioned in the paper are all quite involved. However, under a Naive Bayes-style independence assumption, a simple method can be derived as follows, using only probabilities that are already known after training. For a new document with words w1, ..., wn:

p(zi | w1, ..., wn) ∝ p(w1, ..., wn | zi) * p(zi) = (Π_{j=1..n} p(wj | zi)) * p(zi), for 1 <= i <= k --- (1)

p(zi) ∝ Σ_{j=1..D} p(zi | dj) (under the assumption that all p(dj) are equal) --- (2)
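
To make the derivation concrete, here is a minimal NumPy sketch of equations (1) and (2), computed in log space for numerical stability (the function name topic_posterior and the input format, a list of vocabulary word indices, are my own assumptions; p_w_given_z and p_z_given_d are the k x V and D x k matrices learned during training):

```python
import numpy as np

def topic_posterior(w, p_w_given_z, p_z_given_d):
    """Posterior over topics for a document given as a list of word indices w."""
    # (2): p(z_i) ∝ sum_j p(z_i | d_j), assuming all training documents are equally likely
    p_z = p_z_given_d.sum(axis=0)
    p_z = p_z / p_z.sum()

    # (1): p(z_i | w_1..w_n) ∝ p(z_i) * prod_j p(w_j | z_i)
    # computed in log space to avoid underflow for long documents;
    # assumes p_w_given_z has no exact zeros (true for smoothed LDA estimates)
    log_post = np.log(p_z) + np.log(p_w_given_z[:, w]).sum(axis=1)
    log_post -= log_post.max()          # shift before exponentiating
    post = np.exp(log_post)
    return post / post.sum()            # normalize over the k topics

# example: a new document consisting of word ids 0, 3, 3, 7
# print(topic_posterior([0, 3, 3, 7], p_w_given_z, p_z_given_d))
```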


(a). Are there any errors or problematic assumptions in this derivation?

(b). Are there any papers that compare the performance of a simple method like this against more rigorous ones based on importance sampling, the left-to-right estimator, etc.?
