I wonder whether I can get all the derived forms of a given word.

For example, given the word "good", I would get "goodness", "advantage", etc.

In particular, I want to get the nouns related to an adjective.

Thanks

Jin
  • You could try crawling some webpages like leo.org – User Feb 11 '13 at 21:34
  • There is no such thing built into Python. As User said, find this information on the web and crawl it. Here is a lead you can follow: http://stackoverflow.com/a/419259/1216890 – lpostula Feb 11 '13 at 21:43

2 Answers

What you want is somewhat difficult for a few reasons, chiefly ambiguity. You can reduce the problem by choosing the correct sense of the word yourself, or by applying automatic word sense disambiguation (WSD).

As an example, for the word 'good', there are 27 senses in WordNet, and 37 unique lemmas (dictionary forms).
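To give a feel for what automatic sense selection involves, here is a minimal sketch of a gloss-overlap heuristic (a much simplified relative of the Lesk algorithm, not a full WSD system). The function name `pick_sense` is hypothetical; it only assumes that each candidate exposes a `definition()` method, as NLTK 3.x synsets do.

```python
def pick_sense(context_words, synsets):
    """Return the synset whose definition shares the most words with the context.

    `synsets` is any iterable of objects with a definition() method
    (for example the result of wordnet.synsets(word) in NLTK 3.x).
    Ties go to the earliest candidate; an empty input returns None.
    """
    context = {word.lower() for word in context_words}
    best, best_overlap = None, -1
    for synset in synsets:
        # Treat the gloss as a bag of lowercase, whitespace-separated tokens.
        gloss = set(synset.definition().lower().split())
        overlap = len(context & gloss)
        if overlap > best_overlap:
            best, best_overlap = synset, overlap
    return best
```

With NLTK installed, one might call it as `pick_sense("the fruit is still good to eat".split(), wordnet.synsets("good"))`. NLTK also ships its own Lesk implementation as `nltk.wsd.lesk`, which is probably the better starting point for real use.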

Here is a quick example that lists every sense and lemma of 'good' using NLTK's implementation of WordNet.

>>> from nltk.corpus import wordnet  # needs the corpus: nltk.download('wordnet')
>>> good = wordnet.synsets('good')

>>> # Collect every lemma name across all senses of 'good'.
>>> # In NLTK 3.x, lemmas() and name() are methods (they were attributes in NLTK 2.x).
>>> lemmas = set()
>>> for synset in good:
...     for lemma in synset.lemmas():
...         lemmas.add(lemma.name())
...
>>> sorted(lemmas)
['adept', 'beneficial', 'commodity', 'dear', 'dependable', 'effective', 'estimable', 'expert', 'full', 'good', 'goodness', 'honest', 'honorable', 'in_effect', 'in_force', 'just', 'near', 'practiced', 'proficient', 'respectable', 'right', 'ripe', 'safe', 'salutary', 'secure', 'serious', 'skilful', 'skillful', 'sound', 'soundly', 'thoroughly', 'trade_good', 'undecomposed', 'unspoiled', 'unspoilt', 'upright', 'well']
>>> len(lemmas)
37

>>> # Print each sense followed by the lemmas that belong to it.
>>> for synset in good:
...     print(synset)
...     print(synset.lemmas())
...     print('-' * 79)
...

Synset('good.n.01')
[Lemma('good.n.01.good')]
-------------------------------------------------------------------------------
Synset('good.n.02')
[Lemma('good.n.02.good'), Lemma('good.n.02.goodness')]
-------------------------------------------------------------------------------
Synset('good.n.03')
[Lemma('good.n.03.good'), Lemma('good.n.03.goodness')]
-------------------------------------------------------------------------------
Synset('commodity.n.01')
[Lemma('commodity.n.01.commodity'), Lemma('commodity.n.01.trade_good'), Lemma('commodity.n.01.good')]
-------------------------------------------------------------------------------
Synset('good.a.01')
[Lemma('good.a.01.good')]
-------------------------------------------------------------------------------
Synset('full.s.06')
[Lemma('full.s.06.full'), Lemma('full.s.06.good')]
-------------------------------------------------------------------------------
Synset('good.a.03')
[Lemma('good.a.03.good')]
-------------------------------------------------------------------------------
Synset('estimable.s.02')
[Lemma('estimable.s.02.estimable'), Lemma('estimable.s.02.good'), Lemma('estimable.s.02.honorable'), Lemma('estimable.s.02.respectable')]
-------------------------------------------------------------------------------
Synset('beneficial.s.01')
[Lemma('beneficial.s.01.beneficial'), Lemma('beneficial.s.01.good')]
-------------------------------------------------------------------------------
Synset('good.s.06')
[Lemma('good.s.06.good')]
-------------------------------------------------------------------------------
Synset('good.s.07')
[Lemma('good.s.07.good'), Lemma('good.s.07.just'), Lemma('good.s.07.upright')]
-------------------------------------------------------------------------------
Synset('adept.s.01')
[Lemma('adept.s.01.adept'), Lemma('adept.s.01.expert'), Lemma('adept.s.01.good'), Lemma('adept.s.01.practiced'), Lemma('adept.s.01.proficient'), Lemma('adept.s.01.skillful'), Lemma('adept.s.01.skilful')]
-------------------------------------------------------------------------------
Synset('good.s.09')
[Lemma('good.s.09.good')]
-------------------------------------------------------------------------------
Synset('dear.s.02')
[Lemma('dear.s.02.dear'), Lemma('dear.s.02.good'), Lemma('dear.s.02.near')]
-------------------------------------------------------------------------------
Synset('dependable.s.04')
[Lemma('dependable.s.04.dependable'), Lemma('dependable.s.04.good'), Lemma('dependable.s.04.safe'), Lemma('dependable.s.04.secure')]
-------------------------------------------------------------------------------
Synset('good.s.12')
[Lemma('good.s.12.good'), Lemma('good.s.12.right'), Lemma('good.s.12.ripe')]
-------------------------------------------------------------------------------
Synset('good.s.13')
[Lemma('good.s.13.good'), Lemma('good.s.13.well')]
-------------------------------------------------------------------------------
Synset('effective.s.04')
[Lemma('effective.s.04.effective'), Lemma('effective.s.04.good'), Lemma('effective.s.04.in_effect'), Lemma('effective.s.04.in_force')]
-------------------------------------------------------------------------------
Synset('good.s.15')
[Lemma('good.s.15.good')]
-------------------------------------------------------------------------------
Synset('good.s.16')
[Lemma('good.s.16.good'), Lemma('good.s.16.serious')]
-------------------------------------------------------------------------------
Synset('good.s.17')
[Lemma('good.s.17.good'), Lemma('good.s.17.sound')]
-------------------------------------------------------------------------------
Synset('good.s.18')
[Lemma('good.s.18.good'), Lemma('good.s.18.salutary')]
-------------------------------------------------------------------------------
Synset('good.s.19')
[Lemma('good.s.19.good'), Lemma('good.s.19.honest')]
-------------------------------------------------------------------------------
Synset('good.s.20')
[Lemma('good.s.20.good'), Lemma('good.s.20.undecomposed'), Lemma('good.s.20.unspoiled'), Lemma('good.s.20.unspoilt')]
-------------------------------------------------------------------------------
Synset('good.s.21')
[Lemma('good.s.21.good')]
-------------------------------------------------------------------------------
Synset('well.r.01')
[Lemma('well.r.01.well'), Lemma('well.r.01.good')]
-------------------------------------------------------------------------------
Synset('thoroughly.r.02')
[Lemma('thoroughly.r.02.thoroughly'), Lemma('thoroughly.r.02.soundly'), Lemma('thoroughly.r.02.good')]
-------------------------------------------------------------------------------
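The listing above gives the synonyms within each sense, which is not quite the same as derived forms. WordNet also records derivational links between lemmas, and NLTK 3.x exposes them through `Lemma.derivationally_related_forms()`. The following is a minimal sketch of collecting related nouns for the adjective senses of a word; the helper names `derived_nouns` and `derived_nouns_for` and the `only` parameter are hypothetical, and coverage depends entirely on which links WordNet happens to contain.

```python
def derived_nouns(synsets, only=None):
    """Collect names of noun lemmas derivationally related to adjective lemmas.

    `synsets` is any iterable of objects with the NLTK 3.x Synset interface.
    If `only` is given, just the lemmas with that exact name are followed,
    which skips the derived forms of synonyms such as 'skillful'.
    """
    nouns = set()
    for synset in synsets:
        # WordNet tags adjectives as 'a' and satellite adjectives as 's'.
        if synset.pos() not in ("a", "s"):
            continue
        for lemma in synset.lemmas():
            if only is not None and lemma.name() != only:
                continue
            for related in lemma.derivationally_related_forms():
                # Keep only the related forms that are nouns.
                if related.synset().pos() == "n":
                    nouns.add(related.name())
    return nouns


def derived_nouns_for(word):
    """Convenience wrapper around NLTK's WordNet reader."""
    # Imported lazily so derived_nouns() can be used without the corpus;
    # this call needs nltk plus nltk.download('wordnet').
    from nltk.corpus import wordnet
    return derived_nouns(wordnet.synsets(word), only=word)
```

Calling `derived_nouns_for('good')` follows only the lemmas spelled 'good'; passing `only=None` to `derived_nouns` instead also pulls in nouns derived from the synonyms in each sense, which brings back the ambiguity problem described at the top.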
Wesley Baugh

I would suggest looking at the WordNet corpus in NLTK. More on WordNet here.

owobeid