
I'm developing a function that finds terms in a document. The function takes a HashSet of String as its parameter. I iterate over the HashSet, analyze each string (with the Lucene Analyzer class), then search for the analyzed string in the text with the PhraseQuery class to check whether it occurs in the document. The function returns a HashSet containing only the terms found in the document.

It works, but slowly, because I iterate over the whole HashSet and search for each term separately. Is there a way to give Lucene a collection of words and get back a collection containing only the words that occur in the document?
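To make the desired result concrete, here is a minimal sketch of the "intersection" I am after. It deliberately does not use Lucene: `TermIntersection`, `tokenize` and `findTerms` are hypothetical names, and the lowercase-and-split tokenizer is only a stand-in for a real Analyzer. The idea is to tokenize the document once, record the positions of each token, and then check every (possibly multi-word) term against those positions instead of running a separate search per term.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class TermIntersection {

    // Assumption: a crude stand-in for a Lucene Analyzer.
    // Lowercases the text and splits on anything that is not a letter or digit.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Returns the subset of terms whose tokens occur consecutively in the document.
    static Set<String> findTerms(Set<String> terms, String document) {
        // Tokenize the document once and remember where each token occurs.
        List<String> docTokens = tokenize(document);
        Map<String, List<Integer>> positions = new HashMap<>();
        for (int i = 0; i < docTokens.size(); i++) {
            positions.computeIfAbsent(docTokens.get(i), k -> new ArrayList<>()).add(i);
        }

        Set<String> found = new HashSet<>();
        for (String term : terms) {
            List<String> words = tokenize(term);
            if (words.isEmpty()) {
                continue;
            }
            // Only inspect positions where the first word of the term occurs.
            List<Integer> starts = positions.getOrDefault(words.get(0), Collections.emptyList());
            for (int start : starts) {
                int end = start + words.size();
                if (end <= docTokens.size() && docTokens.subList(start, end).equals(words)) {
                    found.add(term);
                    break;
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        Set<String> terms = new HashSet<>(Arrays.asList("Apache Lucene", "phrase query", "search"));
        String document = "Apache Lucene is a search library.";
        System.out.println(findTerms(terms, document));
    }
}
```

This still loops over the terms, but the document is processed only once and each term costs a map lookup plus a short comparison. What I would like to know is whether Lucene offers an equivalent out of the box.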

Chris Mantle
taubhi
  • Wow! I was just asking almost exactly the same question: "Let's say I have 100 (possibly multi-word) strings and I want to ask Lucene which of these terms are present in a particular document. In other words, I want to get an intersection of query terms vs a document. Is it possible? Is it a valid use case for Lucene?" – Marcin May 19 '14 at 14:59
  • I guess this question was already asked and answered here: http://stackoverflow.com/questions/7896183/get-matched-terms-from-lucene-query – Marcin May 20 '14 at 07:41
  • Thank you very much, I hadn't found that question! It led me to this other good answer: http://stackoverflow.com/questions/2851473/lucene-get-matched-terms-in-query Thanks again! – taubhi May 20 '14 at 08:36

0 Answers