I'm trying to analyze Italian texts in R. As is usual in text analysis, I have removed all punctuation, special characters, and Italian stopwords. But I have a problem with stemming: there is only one Italian stemmer (Snowball), and it is not very precise.
For the stemming I used the stemDocument function from the tm library. I also tried the SnowballC library directly, and both give the same result.
stemDocument(content(myCorpus[[1]]),language = "italian")
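To make the comparison concrete, here is a minimal sketch of how the two approaches can be checked side by side. The sample words are made up for illustration and are not from my corpus. As far as I can tell, tm's stemDocument relies on SnowballC internally, which would explain why both give the same stems.

```r
library(tm)
library(SnowballC)

# Hypothetical sample words, used only for illustration (not from the real corpus)
sample_words <- c("andare", "andato", "andiamo", "bambino", "bambini")

# Stem each word directly with SnowballC
stems_snowball <- wordStem(sample_words, language = "italian")

# Stem the same words through tm (character method of stemDocument)
stems_tm <- stemDocument(sample_words, language = "italian")

# Side-by-side view, to spot stems that look wrong
comparison <- data.frame(word = sample_words,
                         snowball = stems_snowball,
                         tm = stems_tm,
                         stringsAsFactors = FALSE)
print(comparison)
```

Looking at the table word by word makes it easier to see which inflected forms are not reduced to a common stem.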
The problem is that the resulting stems are not very accurate. Are there other, more accurate Italian stemmers? Or is there a way to extend the stemming already available in the tm library by adding new terms?