I am currently in the process of upgrading a search engine application from Lucene 3.5.0 to version 4.10.3. There have been some substantial API changes in version 4 that break backward compatibility. I have managed to fix most of them, but a few issues remain that I could use some help with:
- "cannot override final method from Analyzer"
The original code extended the Analyzer class and the overrode tokenStream(...).
@Override
public TokenStream tokenStream(String fieldName, Reader reader) {
CharStream charStream = CharReader.get(reader);
return
new LowerCaseFilter(version,
new SeparationFilter(version,
new WhitespaceTokenizer(version,
new HTMLStripFilter(charStream))));
}
But this method is final now and I am not sure how to understand the following note from the change log:
ReusableAnalyzerBase has been renamed to Analyzer. All Analyzer implementations must now use Analyzer.TokenStreamComponents, rather than overriding .tokenStream() and .reusableTokenStream() (which are now final).
There is another problem in the method quoted above:
- "The method get(Reader) is undefined for the type CharReader"
There seem to have been some considerable changes here, too.
- "TermPositionVector cannot be resolved to a type"
This class is gone now in Lucene 4. Are there any simple fixes for this? From the change log:
The term vectors APIs (TermFreqVector, TermPositionVector, TermVectorMapper) have been removed in favor of the above flexible indexing APIs, presenting a single-document inverted index of the document from the term vectors.
Probably related to this:
- "The method getTermFreqVector(int, String) is undefined for the type IndexReader."
Both problems occur here, for instance:
TermPositionVector termVector = (TermPositionVector) reader.getTermFreqVector(...);
("reader" is of Type IndexReader)
I would appreciate any help with these issues.