I want to get the base (dictionary) form of natural English words, e.g.:
'words' -> 'word'
'Jhon' -> 'John'
'openning' -> 'open'
I have tried the PyStemmer library (the Stemmer module):
>>> import Stemmer
>>> st = Stemmer.Stemmer('english')
>>> for w in ('very', 'words', 'openning'):
...     print(st.stemWord(w), end=' ')
...
veri word open
I expected 'very' but got 'veri' instead.
Then I tried nltk.corpus.wordnet:
>>> from nltk.corpus import wordnet
>>> wordnet.synsets('beans')
[Synset('bean.n.01'),
 Synset('bean.n.02'),
 Synset('bean.n.03'),
 Synset('attic.n.03'),
 Synset('bean.v.01')]
It gives more information, but it is not a quick dictionary lookup that returns a single base form.
LancasterStemmer does not leave 'english' as 'english':
>>> from nltk.stem.lancaster import LancasterStemmer
>>> st = LancasterStemmer()
>>> st.stem('english')
'engl'
The enchant library's check() and suggest() methods are not suitable either:
>>> import enchant
>>> d = enchant.Dict("en_US")
>>> d.check("Hello")
Is there a quick way to get the base form of every word in a document's text?
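To make the goal concrete, here is a rough sketch of the kind of interface I am after. The lookup table and the names `TOY_BASE_FORMS`, `base_form`, and `normalize_text` are hypothetical placeholders made up for illustration, not part of any library. A real solution would presumably replace the toy table with an actual dictionary or lemmatizer.

```python
import re

# Hypothetical toy table standing in for a real dictionary or lemmatizer.
TOY_BASE_FORMS = {
    "words": "word",
    "jhon": "John",
    "openning": "open",
}


def base_form(word, table=TOY_BASE_FORMS):
    """Return the base form of one word, or the word unchanged if unknown."""
    return table.get(word.lower(), word)


def normalize_text(text, normalizer=base_form):
    """Apply `normalizer` to every alphabetic token; leave the rest untouched."""
    return re.sub(r"[A-Za-z]+", lambda m: normalizer(m.group(0)), text)


print(normalize_text("Jhon is openning very long words."))
# prints: John is open very long word.
```

The important part is that words such as 'very' and 'english' come back unchanged instead of being cut down to 'veri' or 'engl'.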