I want to find the base form of input words in Python.

Something like:

    get_base_form({running, best, eyes, moody})
    --> run, good, eye, mood

A solution that deals only with regular forms would be fine, but an answer that also handles irregular forms would be perfect.

If no library does this, a web service would be fine too.

wotanii
  • Did you [search](https://stackoverflow.com/questions/38763007/how-to-use-spacy-lemmatizer-to-get-a-word-into-basic-form)? – php_nub_qq Jul 26 '18 at 20:35
  • Please read and follow the posting guidelines in the help documentation, as suggested when you created this account. [On topic](http://stackoverflow.com/help/on-topic), [how to ask](http://stackoverflow.com/help/how-to-ask), and [... the perfect question](https://codeblog.jonskeet.uk/2010/08/29/writing-the-perfect-question/) apply here. StackOverflow is not a design, coding, research, or tutorial service. – Prune Jul 26 '18 at 20:40
  • @php_nub_qq yes. "spacy" could work, but I don't think it's the only solution – wotanii Jul 28 '18 at 05:36
  • Google for "lemmatizing", not "stemming". Stems are not base forms of words; they are often not even words, but lemmas are. Lemmas depend on the part-of-speech tag of your word. For Python, you can research spaCy or NLTK. – Suzana Apr 23 '20 at 20:08

1 Answer

Use `SnowballStemmer` from NLTK (the Natural Language Toolkit):

    from nltk.stem.snowball import SnowballStemmer

    # Build a Snowball stemmer for English
    stemmer = SnowballStemmer("english")

    # Both words reduce to the same stem
    print(stemmer.stem("generalized"))
    print(stemmer.stem("generalization"))

Output:

    general
    general

By the way, you can read NLTK's documentation at https://www.nltk.org/
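Note that a stemmer works by stripping suffixes, so it will not map an irregular form such as "best" to "good". To illustrate the difference between the regular and irregular cases from the question, here is a rough standard-library sketch. It is not how NLTK or any real lemmatizer works: the `IRREGULAR_FORMS` table, the `SUFFIXES` list, and the minimum stem length of 3 are all made-up assumptions for this example, and the rules will mangle plenty of ordinary words (for instance "class" or "happy").

```python
# Hypothetical lookup table for irregular forms (illustration only;
# a real lemmatizer ships a far larger dictionary).
IRREGULAR_FORMS = {"best": "good", "better": "good", "went": "go", "feet": "foot"}

# Hypothetical suffixes to strip for regular forms, tried in this order.
SUFFIXES = ("ing", "s", "y")
VOWELS = set("aeiou")


def get_base_form(word):
    """Return a rough base form: irregular lookup first, then suffix stripping."""
    word = word.lower()
    if word in IRREGULAR_FORMS:
        return IRREGULAR_FORMS[word]
    for suffix in SUFFIXES:
        stem = word[: -len(suffix)]
        # Only strip when a reasonably long stem is left over.
        if word.endswith(suffix) and len(stem) >= 3:
            # Undo consonant doubling, e.g. "runn" -> "run".
            if stem[-1] == stem[-2] and stem[-1] not in VOWELS:
                stem = stem[:-1]
            return stem
    return word


print([get_base_form(w) for w in ["running", "best", "eyes", "moody"]])
# ['run', 'good', 'eye', 'mood']
```

For anything beyond a toy, a dictionary-backed lemmatizer (as the comments suggest, spaCy or NLTK) is the safer route, since it also takes the part of speech into account.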

Massoud