I am looking for a good stemmer for Hebrew - I found nothing at all using Google...

On the HebMorph site it says that:

Stem and Lemma originally have different meanings, but for Semitic languages they seem to be used interchangeably.

Does that mean that for NLP purposes, I could use lemmas instead of stems? Keeping in mind that: "Stemmers are much simpler, smaller and usually faster than lemmatizers, and for many applications their results are good enough. Using a lemmatizer for that is a waste of resources." (source)

Thank you.

Cheshie
  • possible duplicate of [Lucene Hebrew analyzer](http://stackoverflow.com/questions/1063856/lucene-hebrew-analyzer) – Chiron Jan 06 '14 at 15:42
  • I don't know how you didn't find anything in Google. http://wiki.apache.org/solr/LanguageAnalysis#Hebrew and https://code.google.com/p/hebstem/ and https://github.com/synhershko/HebMorph – Chiron Jan 06 '14 at 15:43
  • Yeah I also saw that 'hebstem' site, but I couldn't find anything downloadable there. And with HebMorph - I didn't see anything about stemming. It's there that I saw that they use the terms 'lemma' and 'stem' interchangeably. I'm now checking the SOLR page, I didn't see that one before. Thanks. – Cheshie Jan 06 '14 at 17:15

1 Answer

In Hebrew, both stemming and lemmatization are complex: you cannot just trim letters from a word based on its ending, as the Porter stemmer does.
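To make the point concrete, here is a minimal sketch of what goes wrong with naive suffix trimming. The function name `naive_strip` and the two-entry suffix list are my own illustration, not taken from any Hebrew NLP library, and real Hebrew morphology involves far more than plural endings:

```python
# Illustrative only: a Porter-style "strip the ending" rule applied to Hebrew.
# PLURAL_SUFFIXES and naive_strip are hypothetical names for this sketch.
PLURAL_SUFFIXES = ("ים", "ות")


def naive_strip(word: str) -> str:
    """Remove the first matching plural suffix, with no linguistic checks."""
    for suffix in PLURAL_SUFFIXES:
        if word.endswith(suffix):
            return word[: -len(suffix)]
    return word


# Each pair is (input word, form a reader would actually want).
examples = [
    # "books" -> "book": trimming happens to work here.
    ("ספרים", "ספר"),
    # "kings" -> "king": the last letter must switch to its final form.
    ("מלכים", "מלך"),
    # "water": the ending is part of the word itself, not a plural suffix.
    ("מים", "מים"),
]

for word, wanted in examples:
    got = naive_strip(word)
    print(word, got, wanted, "ok" if got == wanted else "WRONG")
```

Only the first word comes out right. The second is left ending in a non-final letter form, and the third is cut down to a single letter. Prefixes (such as the attached forms of "the", "in" and "and") cause similar ambiguity at the start of the word, which is why a dictionary-backed analyzer is usually needed.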

As for an existing lemmatizer implementation, you can try http://hebrew-nlp.co.il, which is currently in beta and free.

Roeya