I am using Lucene Highlighter 2.4.1 in my application to get the best-matching fragments and display them. I call a function String[] getFragmentsWithHighlightedTerms(Analyzer analyzer, Query query, String fieldName, String fieldContents, int fragmentsNumber, int fragmentSize). For example:

String text = doc.get("MetaData");
String[] fragments = getFragmentsWithHighlightedTerms(analyzer, query, "MetaData", text, 5, 100);

The function getFragmentsWithHighlightedTerms() is defined as follows:

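private static String[] getFragmentsWithHighlightedTerms(Analyzer analyzer, Query query, String fieldName,
        String fieldContents, int fragmentsNumber, int fragmentSize) throws IOException
{
    // Tokenize the field contents; the scorer gets a cached copy of the stream.
    TokenStream stream = TokenSources.getTokenStream(fieldName, fieldContents, analyzer);
    SpanScorer scorer = new SpanScorer(query, fieldName, new CachingTokenFilter(stream));
    Fragmenter fragmenter = new SimpleSpanFragmenter(scorer, fragmentSize);

    Highlighter highlighter = new Highlighter(scorer);
    highlighter.setTextFragmenter(fragmenter);
    // Analyze the whole field instead of only the first part of it.
    highlighter.setMaxDocCharsToAnalyze(Integer.MAX_VALUE);

    // Ask for at most fragmentsNumber best-scoring fragments.
    String[] fragments = highlighter.getBestFragments(stream, fieldContents, fragmentsNumber);

    return fragments;
}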

My problem is that highlighter.getBestFragments() returns duplicates. For example, if I display the first 5 fragments, numbers 1 and 3 are the same. I do not understand what is causing this. Is there a problem with the code?

Kevin Reid
mksh15
  • Does the duplicate fragment actually occur in the field content multiple times? Can you post the example query and content? – KenE Jun 08 '10 at 20:30
  • Hi, thanks for your reply. I found the bug: it was in the index creation, which was causing duplicate hits. – mksh15 Jun 14 '10 at 08:16
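
For anyone hitting the same symptom: a quick way to tell whether repeated fragments come from the highlighter or from duplicate hits is to check the stored field contents of the hits before highlighting. The following is only a minimal sketch in plain Java (Java 8 or later), not part of Lucene; the class name DuplicateHitCheck and the method findDuplicates are made up for illustration, and it assumes the field contents of the hits (for example, the values of doc.get("MetaData")) have already been collected into a list in hit order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DuplicateHitCheck {

    // Hypothetical helper: maps each field content that occurs more than once
    // to the positions (0-based hit order) at which it occurs.
    public static Map<String, List<Integer>> findDuplicates(List<String> fieldContents) {
        Map<String, List<Integer>> positions = new LinkedHashMap<>();
        for (int i = 0; i < fieldContents.size(); i++) {
            positions.computeIfAbsent(fieldContents.get(i), k -> new ArrayList<>()).add(i);
        }
        // Keep only contents seen at two or more positions.
        positions.values().removeIf(p -> p.size() < 2);
        return positions;
    }

    public static void main(String[] args) {
        // Sample data standing in for the stored field values of four hits.
        List<String> contents = Arrays.asList(
                "alpha report", "beta summary", "alpha report", "gamma notes");
        System.out.println(findDuplicates(contents)); // prints {alpha report=[0, 2]}
    }
}
```

If this reports duplicates, the same content was indexed more than once and identical fragments are expected; if it reports nothing, the highlighting code is the place to look.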

1 Answer

I don't have the code in front of me, but I think you are getting an array of arrays. So you would need to do this:

item[] = fragments[0]
fragment = item[0]

or just take one item out of the fragments array.

axel22