
I am trying to do morphological analysis as part of POS tagging.

Is there a tool, callable from a Python or Java program, that takes an English word as a parameter and returns its root form and suffix?

For example:

If I give the input 'liked', I want to get the output: like, ed

To get the root form of a given English word, I tried the Porter stemmer and the Snowball stemmer (inside a Python script), but they do not always return a valid root word, since they just strip off the suffix.

from nltk.stem.porter import PorterStemmer
porter_stemmer = PorterStemmer()
print(porter_stemmer.stem("ladies"))
print(porter_stemmer.stem("went"))

Output:

ladi   
went

For example, I gave 'ladies' as input, but it returned 'ladi' as the root form, which is not even an English word.

Sometimes the stemmers just return the input word unchanged. For example, I gave 'went' as input, and these stemmers returned 'went' as the root form instead of 'go'.

Please suggest a tool I can use to get the root form and suffix.

    You don't want a "stemmer", you want morphological analysis. See e.g. http://stackoverflow.com/questions/17317418/stemmers-vs-lemmatizers – tripleee Sep 17 '14 at 18:56

1 Answer

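from nltk.stem.wordnet import WordNetLemmatizer

# Requires the WordNet corpus: run nltk.download('wordnet') once.
WNL = WordNetLemmatizer()
print(WNL.lemmatize('ladies'))         # lady (the default part of speech is noun)
print(WNL.lemmatize('went', pos='v'))  # go (verbs need pos='v', otherwise 'went' comes back unchanged)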

(I'm trying to find something else to say here, but I think that code is self-explanatory?)
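The lemmatizer only returns the root form, not the suffix the question also asks for. As a rough sketch, the suffix can be guessed by checking the original word against a short list of common English inflectional endings once the lemma is known. The function name `split_root_suffix` and the `SUFFIXES` list are my own assumptions for illustration, not part of NLTK, and the heuristic is nowhere near a full morphological analyzer.

```python
# Hypothetical helper, not an NLTK API: pairs a lemma with a guessed suffix.
# SUFFIXES is an assumed, deliberately small list of inflectional endings.
SUFFIXES = ("ing", "est", "ed", "es", "er", "s")


def split_root_suffix(word, lemma):
    """Return (lemma, suffix) for `word`, given a lemma obtained elsewhere.

    The suffix is the longest entry in SUFFIXES that `word` ends with.
    If the word already equals its lemma, or no listed ending matches
    (irregular forms such as 'went'), the suffix is the empty string.
    """
    if word == lemma:
        return lemma, ""
    # Try longer endings first so that 'es' is preferred over 's'.
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix):
            return lemma, suffix
    return lemma, ""


if __name__ == "__main__":
    # The lemmas are passed in by hand here; in practice they would come
    # from a lemmatizer such as WordNetLemmatizer.
    for word, lemma in [("liked", "like"), ("ladies", "lady"), ("went", "go")]:
        print(word, split_root_suffix(word, lemma))
```

In practice the second argument would be the lemmatizer's result, for example `split_root_suffix('liked', WNL.lemmatize('liked', pos='v'))`. Because the lemma is looked up rather than derived by stripping characters, spelling changes such as lady/ladies do not break the root; only the suffix guess is heuristic.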

Darren Cook