I'm using Lucene for my project and I need a custom Analyzer.
Here is the code so far:
public class MyCommentAnalyzer extends Analyzer {
    @Override
    protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
        Tokenizer source = new StandardTokenizer(Version.LUCENE_48, reader);
        TokenStream filter = new StandardFilter(Version.LUCENE_48, source);
        filter = new StopFilter(Version.LUCENE_48, filter, StandardAnalyzer.STOP_WORDS_SET);
        return new TokenStreamComponents(source, filter);
    }
}
It builds, but now I'm stuck. What I need is a filter that keeps only certain words. It is the opposite of what stop words do: instead of removing the terms that appear in a word list, it should keep only the terms that appear in the word list, like a prebuilt dictionary. So the StopFilter doesn't fit the purpose, and none of the other filters Lucene provides seems suitable. I think I need to write my own filter, but I don't know how.
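To make the desired behavior concrete, here is a minimal plain-Java sketch of the selection logic only. It does not use Lucene at all; the class name KeepOnlyDemo, the method keepOnly, and the sample words are made up for illustration. A real solution would have to apply the same test to each token inside a TokenStream.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of the wanted behavior, not Lucene code.
class KeepOnlyDemo {

    // Returns only the tokens that are present in the dictionary,
    // preserving their original order. Everything else is dropped.
    static List<String> keepOnly(List<String> tokens, Set<String> dictionary) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (dictionary.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Assumed sample dictionary and input tokens.
        Set<String> dictionary = new HashSet<>(Arrays.asList("lucene", "analyzer"));
        List<String> tokens = Arrays.asList("my", "lucene", "custom", "analyzer");

        System.out.println(keepOnly(tokens, dictionary)); // prints [lucene, analyzer]
    }
}
```

With a StopFilter built from the same word list, the result would be the complement: "my" and "custom" would survive and the dictionary words would be removed.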
Any suggestions?