Suppose I have two synsets, synset('car.n.01') and synset('bank.n.01'), and I want to find the distance between these two synsets in the WordNet hierarchy. How can I do it using NLTK?
I searched on the internet, but I keep finding similarity algorithms like lin, resnik, jcn, etc., which are not a solution to my question.
Please help me solve this problem.

- Why is `resnik similarity` not a solution to your problem? Also, is the distance the number of synsets that need to be crossed to reach one sense from the other? – axiom Feb 26 '14 at 04:49
- resnik works for ('v','v') and ('n','n') pairs and gives a similarity score, whereas I want the distance between synsets; the distance may be the number of synsets crossed to reach one from the other, or in any other terms you can suggest. – Madhusudan Feb 26 '14 at 04:54
- Do note the gotcha: http://stackoverflow.com/questions/20075335/is-wordnet-path-similarity-commutative/20799567#20799567 – alvas Feb 26 '14 at 08:40
2 Answers
From this
Path similarity, wup_similarity, and lch_similarity should all work, since they are based on the distance between two synsets in the WordNet hierarchy.
from nltk.corpus import wordnet as wn

dog = wn.synset('dog.n.01')
cat = wn.synset('cat.n.01')
print(dog.path_similarity(cat))
print(dog.lch_similarity(cat))
print(dog.wup_similarity(cat))
From the same link:
synset1.path_similarity(synset2):
Return a score denoting how similar two word senses are, based on the shortest path that connects the senses in the is-a (hypernym/hyponym) taxonomy. The score is in the range 0 to 1, except in those cases where a path cannot be found (will only be true for verbs as there are many distinct verb taxonomies), in which case -1 is returned. A score of 1 represents identity, i.e. comparing a sense with itself will return 1.
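If what you actually want is the raw edge count rather than a score, NLTK's `Synset` objects also expose a `shortest_path_distance()` method; `path_similarity` is derived from it as 1 / (distance + 1). A minimal sketch (the printed values assume the WordNet data that NLTK ships):

from nltk.corpus import wordnet as wn

dog = wn.synset('dog.n.01')
cat = wn.synset('cat.n.01')

# Number of edges on the shortest path through the hypernym/hyponym graph.
print(dog.shortest_path_distance(cat))  # 4 with the bundled WordNet data
# path_similarity is 1 / (shortest_path_distance + 1)
print(dog.path_similarity(cat))         # 0.2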
synset1.lch_similarity(synset2), Leacock-Chodorow Similarity:
Return a score denoting how similar two word senses are, based on the shortest path that connects the senses (as above) and the maximum depth of the taxonomy in which the senses occur. The relationship is given as -log(p/2d) where p is the shortest path length and d the taxonomy depth.
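To make the formula concrete, here is a rough hand check; the noun taxonomy depth of 19 is an assumption about the WordNet data NLTK bundles, so treat the constant as illustrative:

import math
from nltk.corpus import wordnet as wn

dog = wn.synset('dog.n.01')
cat = wn.synset('cat.n.01')

p = dog.shortest_path_distance(cat) + 1  # shortest path length counted in nodes
d = 19                                   # assumed max depth of the noun taxonomy
print(-math.log(p / (2.0 * d)))          # should roughly match the line below
print(dog.lch_similarity(cat))           # ~2.03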
synset1.wup_similarity(synset2), Wu-Palmer Similarity:
Return a score denoting how similar two word senses are, based on the depth of the two senses in the taxonomy and that of their Least Common Subsumer (most specific ancestor node). Note that at this time the scores given do not always agree with those given by Pedersen's Perl implementation of Wordnet Similarity.
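The Least Common Subsumer mentioned above can be inspected directly with `lowest_common_hypernyms()`; a small sketch (exact synsets and scores depend on the WordNet version):

from nltk.corpus import wordnet as wn

dog = wn.synset('dog.n.01')
cat = wn.synset('cat.n.01')

# The most specific shared ancestor that Wu-Palmer is based on.
print(dog.lowest_common_hypernyms(cat))  # e.g. [Synset('carnivore.n.01')]
print(dog.wup_similarity(cat))           # ~0.86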
- Thanks axiom for your reply. But the problem here is that these similarity algorithms are applicable only to noun-noun and verb-verb pairs. I want to use the concept of distance between synsets to check whether a word is polysemous, by finding the distance between synsets of the same word. For example, the word bank has 12 synsets; some are related to the 'financial sector' and some to 'river bank'. I want to find the distance among various combinations of synsets of the same word, so that if a single score is below a threshold the word is considered polysemous. – Madhusudan Feb 26 '14 at 05:15
- @Madhusudan Wu-Palmer would do that for you. Compare these: `wn.wup_similarity(wn.synset('bank.n.01'), wn.synset('bank.n.03'))` is 0.61 and `wn.wup_similarity(wn.synset('bank.n.01'), wn.synset('bank.n.04'))` is 0.16 – Mehdi Jul 04 '15 at 10:20
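Following up on the comments above, here is a rough sketch of the pairwise check described there: compare every pair of noun senses of a word and flag the word when any pair falls below a threshold. The function name `looks_polysemous` and the threshold value are made up for illustration; tune the threshold for your own data.

from itertools import combinations
from nltk.corpus import wordnet as wn

def looks_polysemous(word, pos=wn.NOUN, threshold=0.3):
    """Flag a word when any pair of its senses scores below the threshold."""
    senses = wn.synsets(word, pos=pos)
    for s1, s2 in combinations(senses, 2):
        score = s1.wup_similarity(s2)
        if score is not None and score < threshold:
            return True
    return False

print(looks_polysemous('bank'))  # likely True: the 'river bank' and 'financial bank' senses diverge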
In addition, you could have a look at the chatterbot implementation; you will find more distance processing in that file.
