
I have successfully retrieved synsets connected to a base synset via other semantic relations, as follows:

wn.synset('good.a.01').also_sees()
Out[63]: 
[Synset('best.a.01'),
 Synset('better.a.01'),
 Synset('favorable.a.01'),
 Synset('good.a.03'),
 Synset('obedient.a.01'),
 Synset('respectable.a.01')]

wn.synset('good.a.01').similar_tos()
Out[64]: 
[Synset('bang-up.s.01'),
 Synset('good_enough.s.01'),
 Synset('goodish.s.01'),
 Synset('hot.s.15'),
 Synset('redeeming.s.02'),
 Synset('satisfactory.s.02'),
 Synset('solid.s.01'),
 Synset('superb.s.02'),
 Synset('well-behaved.s.01')]

However, the antonym relation seems different. I managed to retrieve the lemma connected to my base synset, but was not able to retrieve the actual synset, like so:

wn.synset('good.a.01').lemmas()[0].antonyms()
Out[67]: [Lemma('bad.a.01.bad')]

How can I get the synset, and not the lemma, that is connected via antonymy to my base synset - wn.synset('good.a.01') ? TIA

alvas
modarwish
  • It's tricky because antonymy relationships link lemmas to lemmas, not synsets to synsets. – alvas Dec 05 '16 at 08:48
  • Note that on http://wordnetweb.princeton.edu/perl/webwn?o2=&o0=1&o8=1&o1=1&o7=&o5=&o9=&o6=&o3=&o4=&s=good&i=8&h=00001000000000000000000000000000#c , the **S** means Synset and **W** means word (i.e. lemma) – alvas Dec 05 '16 at 08:53
  • Hi Alvas! I was actually trying to get your email but could not find it.. how can I contact you? I recall you have helped most on all my Wordnet questions here :) – modarwish Dec 05 '16 at 08:56
  • From the `Lemma` object you should be able to call `synset()`, e.g. `x = wn.lemma('bad.a.01.bad'); x.synset()` – alvas Dec 05 '16 at 09:17

1 Answer


WordNet indexes antonymy relations at the Lemma level rather than the Synset level (see http://wordnetweb.princeton.edu/perl/webwn?o2=&o0=1&o8=1&o1=1&o7=&o5=&o9=&o6=&o3=&o4=&s=good&i=8&h=00001000000000000000000000000000#c), so the question is how Lemmas map to Synsets: one-to-one, or many-to-many?


In the case of ambiguous words (one word, many meanings), we have a one-to-many relation from String to Synset, e.g.

>>> wn.synsets('dog')
[Synset('dog.n.01'), Synset('frump.n.01'), Synset('dog.n.03'), Synset('cad.n.01'), Synset('frank.n.02'), Synset('pawl.n.01'), Synset('andiron.n.01'), Synset('chase.v.01')]

In the case of one meaning/concept with multiple representations, we have a one-to-many relation from Synset to String (where String refers to Lemma names):

>>> dog = wn.synset('dog.n.1')
>>> dog.definition()
u'a member of the genus Canis (probably descended from the common wolf) that has been domesticated by man since prehistoric times; occurs in many breeds'
>>> dog.lemma_names()
[u'dog', u'domestic_dog', u'Canis_familiaris']

Note: up to this point we have been comparing the relationships between Strings and Synsets, not between Lemmas and Synsets.


The "cute" thing is that Lemma and String have a one-to-one relationship:

>>> wn.synsets('dog')
[Synset('dog.n.01'), Synset('frump.n.01'), Synset('dog.n.03'), Synset('cad.n.01'), Synset('frank.n.02'), Synset('pawl.n.01'), Synset('andiron.n.01'), Synset('chase.v.01')]
>>> wn.synsets('dog')[0]
Synset('dog.n.01')
>>> wn.synsets('dog')[0].definition()
u'a member of the genus Canis (probably descended from the common wolf) that has been domesticated by man since prehistoric times; occurs in many breeds'
>>> wn.synsets('dog')[0].lemmas()
[Lemma('dog.n.01.dog'), Lemma('dog.n.01.domestic_dog'), Lemma('dog.n.01.Canis_familiaris')]
>>> wn.synsets('dog')[0].lemmas()[0]
Lemma('dog.n.01.dog')
>>> wn.synsets('dog')[0].lemmas()[0].name()
u'dog'

The _name property of a Lemma object holds a single unicode string, not a list. See the source at https://github.com/nltk/nltk/blob/develop/nltk/corpus/reader/wordnet.py#L202 and https://github.com/nltk/nltk/blob/develop/nltk/corpus/reader/wordnet.py#L444

And it seems that each Lemma belongs to exactly one Synset (a Synset can hold several Lemmas, but not the other way round). From the docstring at https://github.com/nltk/nltk/blob/develop/nltk/corpus/reader/wordnet.py#L220:

Lemma attributes, accessible via methods with the same name::

  • name: The canonical name of this lemma.
  • synset: The synset that this lemma belongs to.
  • syntactic_marker: For adjectives, the WordNet string identifying the syntactic position relative modified noun. See: http://wordnet.princeton.edu/man/wninput.5WN.html#sect10 For all other parts of speech, this attribute is None.
  • count: The frequency of this lemma in wordnet.

So we can call synset() on a Lemma and know that each Lemma object returns exactly one Synset:

>>> wn.synsets('dog')[0].lemmas()[0]
Lemma('dog.n.01.dog')
>>> wn.synsets('dog')[0].lemmas()[0].synset()
Synset('dog.n.01')
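As a quick sanity check of that claim, the sketch below tests whether every lemma of a synset points back to that same synset. The helper name `lemmas_point_back` is made up for illustration; it relies only on the `lemmas()` and `synset()` methods shown above, and `demo` assumes NLTK and its WordNet data are installed.

```python
def lemmas_point_back(synset):
    """Return True if every lemma of `synset` reports `synset` as its owner."""
    return all(lemma.synset() == synset for lemma in synset.lemmas())


def demo():
    # Assumes NLTK and the WordNet corpus are available locally.
    from nltk.corpus import wordnet as wn
    for ss in wn.synsets('dog'):
        print(ss, lemmas_point_back(ss))
```

Calling `demo()` prints one line per sense of 'dog'; if the docstring holds, the second column should be `True` each time.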

Assuming that you are trying to do some sentiment analysis and you need the antonyms of every adjective in WordNet, you can do the following to get the Synsets of the antonyms:

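>>> from itertools import chain
>>> from nltk.corpus import wordnet as wn
>>> all_adj_in_wn = wn.all_synsets(pos='a')
>>> def get_antonyms(ss):
...     # Lemma -> antonym Lemmas -> their Synsets, flattened and de-duplicated.
...     return set(chain(*[[a.synset() for a in l.antonyms()] for l in ss.lemmas()]))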
...
>>> for ss in all_adj_in_wn:
...     print ss, ':', get_antonyms(ss)
... 
Synset('unable.a.01') : set([Synset('unable.a.01')])
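
To answer the original question for a single synset, the same idea can be written for Python 3 as a set comprehension. This is only a sketch: `antonym_synsets` is a name chosen here, and `demo` assumes NLTK plus its WordNet data are installed.

```python
def antonym_synsets(synset):
    """Collect the synsets that own the antonym lemmas of `synset`'s lemmas."""
    return {
        antonym.synset()
        for lemma in synset.lemmas()
        for antonym in lemma.antonyms()
    }


def demo():
    # Assumes NLTK and the WordNet corpus are available locally.
    from nltk.corpus import wordnet as wn
    print(antonym_synsets(wn.synset('good.a.01')))
```

Calling `demo()` should print a set containing `Synset('bad.a.01')`, since the question already shows `Lemma('bad.a.01.bad')` as an antonym of the first lemma of `good.a.01`. An empty set means none of the synset's lemmas has an antonym recorded.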
alvas
  • Errr, make a portal with your [sling ring](http://marvel.wikia.com/wiki/Sling_Ring) ;P? – alvas Dec 05 '16 at 09:50