How can I get the tokens (whether as a list of tokens, a TokenStream, or something else) that were used for a Field within a Document from a Lucene index? That is, is it possible to get back from the index the tokens that were passed in as tokens (in the example below)? (I'm not asking how to get tokens out of a TokenStream.)
doc.add(new Field("title", tokens));
In the documentation there's Field.tokenStreamValue(), but when I do doc.getFieldable(field_name), that simply returns null.
I've also tried (from the third comment in lucene - Fieldable.tokenStreamValue()):
TokenSources.getTokenStream(reader, doc_id, field_name)
but I get:
java.lang.IllegalArgumentException: title in doc #630does not have any term position data stored
at org.apache.lucene.search.highlight.TokenSources.getTokenStream(TokenSources.java:256)