
I have different types of documents, each of which may have multiple authors. When searching, I would like to:

  1. group the results by author, so that I can count the number of documents of each type by each author, and
  2. use the highlighter to highlight the documents belonging to a selected author.

How should I index and search the documents to achieve this? In particular, how do I perform grouping when a document can have multiple authors and the documents are of different types?

PSK
  • You may already have seen these, but just in case: (1) You can see some sample grouping code [here](https://lucene.apache.org/core/8_8_0/grouping/org/apache/lucene/search/grouping/package-summary.html#package.description). (2) You can see some sample highlighting code [here](https://lucene.apache.org/core/8_8_0/highlighter/org/apache/lucene/search/highlight/package-summary.html#package.description). If you have trouble with either of those, you can describe the specific problem - but those should be good starting points. I have used (2) with no problem. I have not used (1). – andrewJames Feb 02 '21 at 13:44
  • Thanks for the resources @andrewjames, I have looked at both. (2) works well for me too and I have no issues with it. For (1), the documentation states, "This module enables search result grouping with Lucene, where hits with the same value in the specified single-valued group field are grouped together." But my documents may have multiple authors, which makes the author field multi-valued. Does that mean I can't use (1) for my use case? The only alternative I can think of is indexing the same document once per author, which seems wasteful. – PSK Feb 03 '21 at 03:43
  • I also found [this](https://stackoverflow.com/questions/8550818/whats-the-difference-between-grouping-and-facet-in-lucene-3-5) useful. I couldn't wrap my head around grouping, so I went with faceting instead: I facet on the first search, then search again with the same query plus an additional author filter to perform the highlighting. [This](https://github.com/svn2github/pylucene/blob/master/samples/FacetExample.py) is a good example of faceting in PyLucene. – PSK Feb 04 '21 at 12:54
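
To make the two-pass approach from the last comment concrete, here is a minimal plain-Python sketch of what it computes. It does not use the Lucene or PyLucene API at all; the sample documents, the field names (`type`, `authors`, `text`), and the helpers `search` and `facet_counts` are hypothetical and exist only for illustration. The point it shows is that a document with several authors can be stored once and still counted once under each of its authors, so no per-author duplication is needed.

```python
from collections import Counter

# Hypothetical sample documents; the field names are assumptions for illustration.
docs = [
    {"id": 1, "type": "article", "authors": ["alice", "bob"], "text": "lucene grouping basics"},
    {"id": 2, "type": "report", "authors": ["alice"], "text": "faceting in lucene"},
    {"id": 3, "type": "article", "authors": ["bob"], "text": "highlighting search hits"},
    {"id": 4, "type": "article", "authors": ["alice"], "text": "unrelated cooking notes"},
]


def search(docs, term, author=None):
    """Return docs whose text contains `term`, optionally restricted to one author."""
    return [
        d for d in docs
        if term in d["text"].split() and (author is None or author in d["authors"])
    ]


def facet_counts(hits):
    """Count hits per (author, type); a multi-author doc counts once per author."""
    counts = Counter()
    for d in hits:
        for a in set(d["authors"]):  # set() guards against a repeated author value
            counts[(a, d["type"])] += 1
    return counts


# Pass 1: run the query and compute per-author, per-type counts.
hits = search(docs, "lucene")
counts = facet_counts(hits)

# Pass 2: same query plus an author filter; these are the hits to highlight.
alice_hits = search(docs, "lucene", author="alice")
```

In an actual index, the first pass would correspond to a faceted search over a multi-valued author dimension and the second to the same query combined with an author filter, with the highlighter applied to the second result set.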

0 Answers