4

I read this question (Coherence score 0.4 is good or bad?) and found that the u_mass coherence score is said to range from -14 to 14. But in my experiments I got -18 for u_mass and 0.67 for c_v. How can my u_mass score fall outside the range (-14, 14)?

Update: I used the gensim library and scanned the number of topics from 2 to 50. The u_mass curve starts near 0, falls to its lowest negative point, and then turns back up a bit, like an upside-down version of the c_v curve.

Dammio

4 Answers

4

I consulted two sources whose findings agree, and they may clear up my doubt: https://www.os3.nl/_media/2017-2018/courses/rp2/p76_report.pdf

https://amp.reddit.com/r/learnmachinelearning/comments/9bcr77/coherence_score_u_mass/

I believe that for u_mass the graph tends to be an upside-down version of the c_v graph, so the lowest negative point is the best. That applies if you use gensim, of course.

Here is the figure for models trained with 2 to 50 topics:

Dammio
  • I do not think it is just the upside down, have a look at [Differences among Topic Coherence Metrics ("u_mass", "c_v", ...) & Choose the Best Threshold Value to Filter Out "Low-quality" Documents](https://groups.google.com/g/gensim/c/CsscFah0Ax8) where the two have their own curves, not just mirroring the other. – questionto42 Jul 13 '21 at 22:29
  • @questionto42 It depends on the dataset, but to be clear, we do see this tendency. For u_mass, the graph starts from one of its highest points and then goes down gradually; of course there are some bumps along the way before it gradually goes up again. The same thing happens with c_v, but in the opposite direction. – Dammio Sep 02 '21 at 02:18
  • Agreed, the two curves are very roughly upside down, not mirrored of course, but as a very rough trend, one can see that in the link (pressing on the `...` to see the images). This was meant by you from the start, sorry for bothering. – questionto42 Sep 02 '21 at 06:29
  • 1
    Just to make the wording a bit clearer: "the lowest negative point is the best" for u_mass, while the highest coherence is reached with the highest c_v. – questionto42 Sep 14 '21 at 14:14
  • 2
    I found this and it says otherwise. It looks like a reliable source of information, as it comes from one of the developers of Gensim: https://github.com/RaRe-Technologies/gensim/pull/710#issuecomment-425344644 – Tito Sanz Oct 27 '21 at 12:27
  • @Tito Sanz: Can you run an experiment and look at the resulting graph? In my case (the figure), wouldn't it be weird if K=2 were the best? – Dammio Feb 04 '22 at 15:08
4

According to what is stated here (pp. 13-14), the same document that @Dammio mentions in his answer, the interpretation is the opposite. The text says: "According to the UMASS coherence measurements, the coherence of the topics globally decreases when K increases." K is the number of topics. It continues: "For the analysis, we compare models with K = 6 for 40 iterations which is a local minimum, and for 10 iterations which performed better." The figure clearly shows that this compares a local minimum, which is worse, with a local maximum, which is more coherent. This is exactly the opposite of what the accepted answer states. Besides, I found a GitHub post saying exactly the same, that higher values are better: Link to Github answer

Figure 4 in pdf document
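
To make the "higher is better" reading concrete, here is a minimal pure-Python sketch of a u_mass-style score on a toy corpus. It is not gensim's implementation: the function name `u_mass`, the toy documents, and the smoothing constant `EPSILON = 1e-12` (chosen on the assumption that it resembles what gensim uses) are all illustrative. Each ordered word pair contributes log((P(w_i, w_j) + eps) / P(w_j)), and the pair scores are averaged.

```python
import math

EPSILON = 1e-12  # assumed smoothing constant, chosen for illustration


def u_mass(topic_words, documents, epsilon=EPSILON):
    """Mean over word pairs (i > j) of log((P(w_i, w_j) + eps) / P(w_j)).

    Probabilities are document frequencies divided by the number of documents.
    Assumes every topic word occurs in at least one document.
    """
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def doc_prob(*words):
        # Fraction of documents that contain all of the given words.
        hits = sum(1 for d in doc_sets if all(w in d for w in words))
        return hits / n_docs

    scores = []
    for i in range(1, len(topic_words)):
        for j in range(i):
            w_i, w_j = topic_words[i], topic_words[j]
            scores.append(math.log((doc_prob(w_i, w_j) + epsilon) / doc_prob(w_j)))
    return sum(scores) / len(scores)


# Hypothetical toy corpus: three "pet" documents and three "finance" documents.
docs = [
    ["cat", "dog", "pet"],
    ["cat", "dog", "vet"],
    ["dog", "pet", "vet"],
    ["stock", "bond", "market"],
    ["stock", "market", "trade"],
    ["bond", "market", "trade"],
]

coherent_score = u_mass(["cat", "dog", "pet"], docs)   # words that co-occur
mixed_score = u_mass(["cat", "stock", "pet"], docs)    # words from both themes

print("coherent topic:", coherent_score)
print("mixed topic:   ", mixed_score)
```

In this sketch the topic whose words actually co-occur scores close to 0, while the mixed topic is pulled far into the negatives by the pairs that never co-occur. That matches the reading that a higher (less negative) u_mass indicates a more coherent topic.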

Tito Sanz
1

According to the mathematical formula for the u_mass coherence score given in the original paper, the closer u_mass is to 0, the better the coherence, with 0 meaning perfect coherence. The score fluctuates on either side of 0 depending on the number of topics chosen and the kind of data used for topic clustering. The best way to judge u_mass is to plot it against different values of K (the number of topics) and choose the K whose u_mass is closest to 0.
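
As a rough illustration of why a fixed range such as (-14, 14) need not hold, the sketch below computes the most negative value a single word pair can contribute, log((0 + eps) / P(w_j)), for a pair that never co-occurs. The corpus size, the document frequency, and both smoothing choices are hypothetical: `1e-12` is assumed to resemble the tiny constant gensim uses, and `1 / n_docs` corresponds to the add-one-to-the-count smoothing of the original formula.

```python
import math


def pair_floor(doc_freq_wj, n_docs, epsilon):
    """Score of a pair that never co-occurs: log((0 + eps) / P(w_j))."""
    return math.log(epsilon / (doc_freq_wj / n_docs))


# Hypothetical corpus: 1000 documents, word w_j appears in 10 of them.
n_docs, doc_freq_wj = 1000, 10

# Tiny smoothing constant (assumption: similar to what gensim uses).
floor_tiny_eps = pair_floor(doc_freq_wj, n_docs, epsilon=1e-12)

# Add-one smoothing on the co-occurrence count, i.e. eps = 1 / n_docs.
floor_add_one = pair_floor(doc_freq_wj, n_docs, epsilon=1 / n_docs)

print("floor with eps = 1e-12:   ", floor_tiny_eps)
print("floor with eps = 1/n_docs:", floor_add_one)
```

Under these assumptions the floor depends on the smoothing constant and on the corpus rather than on a fixed bound, so an average score like -18 can arise when many word pairs in a topic never co-occur.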

You can refer to this link, which provides a Python code snippet for plotting c_v against different values of K. You can replace c_v with the u_mass coherence metric there.

I hope this explanation helps.

Piyush
  • I use gensim and my graph does not look like what you describe. It starts near 0, falls to its lowest point, then turns back up a bit and fluctuates around that level. It looks like an upside-down version of c_v. Could gensim be doing something different? – Dammio May 27 '20 at 07:08
  • Can I see your graph here? – Piyush May 27 '20 at 11:49
  • Here it is: http://www.mediafire.com/view/nkq9qw9l55om031/Figure_1.png – Dammio May 27 '20 at 19:08
0

I think the discussion would benefit from a picture showing the convergence of both coherence measures during successful training. Here is an example from my own research:

The full code is in this Kaggle notebook.

Maciej Skorski