Why do different runs of the same iteration produce different results?

Question

I've created a dictionary with the document-topic probabilities from a Gensim LDA model. Every time I iterate over it, even with exactly the same code, I get slightly different values. Why is this? (Note: this happens when the same code is copied and pasted into another Jupyter cell.)

for r in doc_topics[:2]:
    print(r)

First time produces:

[(5, 0.46771166), (8, 0.09964698), (12, 0.08084056), (55, 0.16801219), (58, 0.07947531), (97, 0.04642806)]
[(8, 0.7273078), (69, 0.06939292), (78, 0.062151615), (101, 0.119957164)]

Second run produces:

[(5, 0.47463417), (8, 0.105600394), (12, 0.06531593), (55, 0.16066092), (58, 0.06662597), (97, 0.054465853)]
[(8, 0.7306167), (69, 0.054978732), (78, 0.06831972), (84, 0.025588958), (101, 0.10244013)]

Third:

[(5, 0.4771855), (8, 0.09988891), (12, 0.088423), (55, 0.15682992), (58, 0.058175407), (97, 0.053951494)]
[(8, 0.75193375), (69, 0.059308972), (78, 0.0622621), (84, 0.020040851), (101, 0.09659243)]

And so on...

For reproducibility you must specify a random seed in your LDA model. That way, using the same seed always produces the same results. – Massifox, Sep 29 '19 at 08:18
How is `doc_topics` created? What's `type(doc_topics)`? Are you sure no other code is being run between two runs of your code? What if you try `print(r); print(r)` instead of one print, or if you repeat your code twice inside a single cell? (You may want to expand your question with these details, for more formatting control, rather than answering in a comment.) – gojomo, Sep 30 '19 at 03:57

score 0 · Answer 1 · answered Sep 29 '19 at 08:03


Because almost every ML algorithm has a slight element of randomness in both the training and inference steps.

This question has been asked before, so next time you can search for it and find an answer quickly (: See:

LDA model generates different topics everytime i train on the same corpus


Yoel Nisanov


Hi, let me explain: I am *not* re-running the LDA, only the final lines of code pasted above (i.e. AFTER creating the document-topic dictionary, I simply iterate over it without regenerating it) – Dror M Sep 29 '19 at 12:12
You're not regenerating the document-topic dictionary every time? – Yoel Nisanov Sep 29 '19 at 13:08
No - simply iterating through the-already-generated dictionary – Dror M Sep 30 '19 at 16:55
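One possible explanation for what these comments describe, offered as an assumption because the question does not show how `doc_topics` was built: if `doc_topics` is a lazy object (for example, what Gensim returns from `lda[corpus]`) rather than a stored list, every loop over it re-runs inference, and inference involves random draws. The sketch below uses only the standard library to imitate that behaviour. `LazyDocTopics` and `_infer` are hypothetical names and are not part of Gensim.

```python
import random


class LazyDocTopics:
    """Hypothetical stand-in for a lazy wrapper that recomputes on every iteration."""

    def __init__(self, docs, seed=0):
        self.docs = docs
        # One shared generator: its state advances with every draw,
        # so later passes see different random numbers.
        self.rng = random.Random(seed)

    def _infer(self, doc):
        # Stand-in for randomized inference: a fixed base value plus small noise.
        return len(doc) / 10 + self.rng.uniform(-0.01, 0.01)

    def __iter__(self):
        for doc in self.docs:
            yield self._infer(doc)


docs = [["a", "b", "c"], ["d", "e"]]
lazy = LazyDocTopics(docs)

first_pass = list(lazy)    # runs the "inference" once per document
second_pass = list(lazy)   # runs it again, with fresh noise

frozen = list(lazy)        # materialize the results once
frozen_again = list(frozen)  # re-reading stored values changes nothing

print(first_pass)
print(second_pass)
print(frozen == frozen_again)
```

If this is what is happening, storing the results once (for example `doc_topics = list(doc_topics)`) and looping over the stored copy should give identical output on every run.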

score 0 · Answer 2 · answered Sep 29 '19 at 08:06


To achieve reproducibility, you need to specify the random_state argument to the LdaModel constructor:

https://radimrehurek.com/gensim/models/ldamodel.html


NPE


See my comment on the answer above: I am not re-running the LDA, only the final lines of code pasted above (i.e. AFTER creating the document-topic dictionary, I simply iterate over it without regenerating it) – Dror M Sep 29 '19 at 12:12
