I have a simple project that needs to do this kind of thing:

Sports -> Sport
Walking -> Walk

and ideally also do things like:

good -> better
better -> good 
person -> people
people -> person

Could someone point me to the most lightweight library that can achieve this? I know there are libraries like Lucene, CoreNLP, etc., but these are quite heavy and I really just need a stemmer / lemmatiser.

Thank you!

Johny19

1 Answer

If you are OK with coarse results (for example, updates -> updat) and a small footprint is crucial, use stemming. Take a look at the question devoted to stemming, which lists several options: Snowball, MG4J, and others. There is also a WordNet stemmer included in JWI.
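To give a feel for how coarse stemming is, here is a minimal sketch of a suffix-stripping stemmer in plain Java. It is not the Porter or Snowball algorithm and does not use any of the libraries above; the class name `NaiveStemmer`, the suffix list, and the minimum stem length are all assumptions made up for illustration.

```java
import java.util.Locale;

// Hypothetical, deliberately naive stemmer: strips one common English suffix.
// This is NOT Porter/Snowball; it only illustrates why stems can look odd.
final class NaiveStemmer {
    // Checked in order, so longer suffixes are tried before the bare "s".
    private static final String[] SUFFIXES = {"ing", "ed", "es", "s"};
    // Assumed lower bound so that short words such as "bus" or "red" are left alone.
    private static final int MIN_STEM_LENGTH = 3;

    private NaiveStemmer() {}

    static String stem(String word) {
        String w = word.toLowerCase(Locale.ROOT);
        for (String suffix : SUFFIXES) {
            int stemLength = w.length() - suffix.length();
            if (w.endsWith(suffix) && stemLength >= MIN_STEM_LENGTH) {
                return w.substring(0, stemLength);
            }
        }
        return w; // no known suffix: return the lowercased word unchanged
    }
}
```

With this sketch, `NaiveStemmer.stem("Sports")` gives `sport` and `NaiveStemmer.stem("Walking")` gives `walk`, but `NaiveStemmer.stem("updates")` gives `updat`, and irregular forms such as `people` pass through untouched. A real stemmer has many more rules, yet the same kind of limitation applies.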

If you want more accurate results, you need lemmatization, which also has several libraries: Stanford CoreNLP (it is really not that complicated) or CICWN, which is based on WordNet.
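The difference from stemming is that a lemmatizer consults a dictionary, so it can map irregular forms to real words. The sketch below shows the idea with a tiny hand-written exception table plus two fallback rules. It does not use the CoreNLP or CICWN APIs; the class name `TinyLemmatizer`, the table contents, and the fallback rules are assumptions for illustration, and a real lemmatizer would also take the part of speech into account.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical dictionary-backed lemmatizer sketch (requires Java 9+ for Map.of).
final class TinyLemmatizer {
    // Assumed, hand-written table of irregular forms; real tools load this from WordNet or a model.
    private static final Map<String, String> IRREGULAR = Map.of(
            "people", "person",
            "better", "good",
            "best", "good");

    private TinyLemmatizer() {}

    static String lemma(String word) {
        String w = word.toLowerCase(Locale.ROOT);
        String irregular = IRREGULAR.get(w);
        if (irregular != null) {
            return irregular; // a dictionary hit wins over any suffix rule
        }
        if (w.length() > 4 && w.endsWith("ing")) {
            return w.substring(0, w.length() - 3); // walking -> walk
        }
        if (w.length() > 3 && w.endsWith("s") && !w.endsWith("ss")) {
            return w.substring(0, w.length() - 1); // sports -> sport, updates -> update
        }
        return w;
    }
}
```

Here `TinyLemmatizer.lemma("people")` returns `person` and `TinyLemmatizer.lemma("better")` returns `good`, which a pure suffix stripper cannot do. Going the other way (good -> better, person -> people) is inflection generation rather than lemmatization and would need a reverse table.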

Nikita Astrakhantsev