I'm using Zend Lucene, but I don't think the question is specific to that library.

Say I want to provide full-text search for a database of books. Assume the following models:

Model 1:

TABLE: book
- book_id
- name

TABLE: book_author
- book_author_id
- book_id
- author_id

TABLE: author
- author_id
- name

(a book can have 0 or more authors)

Model 2:

TABLE: book
- book_id
- name

TABLE: book_eav
- book_eav_id
- book_id
- attribute (e.g. "author")
- value (e.g. "Tom Clancy")

(a book can have 0 or more authors + information about publisher, number of pages, etc.)

What do I need to do in order to insert all the authors associated with a particular book in a document to be indexed? Do I put all the authors in one field in the document? Would I use some sort of delimiter to group author information? I'm looking for general strategies with this kind of data.

StackOverflowNewbie

1 Answer

Put all the authors in one field of the document, separated by a delimiter. The document schema would then be:

book_id
name
author: |author 1|author 2|...|author n|
other_attribute_1: |val 1|val 2|
other_attribute_2: |val 1|val 2|
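
As a rough illustration of how such documents could be assembled from Model 2, here is a minimal Python sketch. It is not Zend Lucene code: `delimited` and `build_documents` are hypothetical helper names, the rows are assumed to arrive as plain tuples from the database, and the resulting dictionaries would still have to be turned into index fields by whatever library you use. It also assumes that no attribute is named `book_id` or `name`.

```python
from collections import defaultdict

# Assumption: the delimiter never occurs inside a value and survives the analyzer.
DELIMITER = "|"


def delimited(values, delimiter=DELIMITER):
    """Wrap every value in delimiters: ['a', 'b'] -> '|a|b|'."""
    for value in values:
        if delimiter in value:
            raise ValueError(f"delimiter {delimiter!r} occurs in value {value!r}")
    if not values:
        return ""
    return delimiter + delimiter.join(values) + delimiter


def build_documents(books, eav_rows):
    """Build one dict per book, with one delimited field per EAV attribute.

    books:    iterable of (book_id, name)
    eav_rows: iterable of (book_id, attribute, value)
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for book_id, attribute, value in eav_rows:
        grouped[book_id][attribute].append(value)

    documents = []
    for book_id, name in books:
        doc = {"book_id": book_id, "name": name}
        for attribute, values in grouped[book_id].items():
            doc[attribute] = delimited(values)
        documents.append(doc)
    return documents
```

For Model 1 the same `delimited` helper applies: join `book_author` to `author`, collect the author names per book, and pass that list in. A book with no authors simply gets no `author` field.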

With this schema, you can search by author and apply a different boost to each kind of match, using a query like:

(author:"|Tom Clancy|")^10 OR
(author:"Tom Clancy")^5 OR
(author:(Tom Clancy))^1

This query ranks exact matches first, then phrase matches, and finally other matches.
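
If the author name comes from user input, the query string can be generated rather than typed by hand. The following Python sketch shows one way to do that; `build_author_query`, `escape_term` and `escape_phrase` are hypothetical helpers, and the escaping rules follow the classic Lucene query parser (a backslash before each special character), which is an assumption you should check against the parser you actually use. The last clause groups the terms in parentheses so that every term is searched in the `author` field rather than only the first one.

```python
# Characters the classic Lucene query parser treats as special (assumption:
# your parser uses the same set and accepts backslash escaping).
SPECIAL_CHARS = set('\\+-!():^[]"{}~*?|&/')


def escape_term(term):
    """Backslash-escape special characters in a bare (unquoted) term."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in term)


def escape_phrase(text):
    """Escape backslashes and double quotes for use inside a quoted phrase."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def build_author_query(author, delimiter="|", field="author"):
    """Return the three-tier boosted query: exact, phrase, then loose terms."""
    terms = [escape_term(t) for t in author.split()]
    if not terms:
        raise ValueError("author must contain at least one term")
    phrase = escape_phrase(author.strip())
    clauses = [
        f'({field}:"{delimiter}{phrase}{delimiter}")^10',  # whole delimited value
        f'({field}:"{phrase}")^5',                         # words adjacent, in order
        f'({field}:({" ".join(terms)}))^1',                # any of the words
    ]
    return " OR ".join(clauses)


print(build_author_query("Tom Clancy"))
```

The boosts 10, 5 and 1 are taken from the query above; they are relative weights, so only their ordering matters for ranking exact matches ahead of the rest.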

hkn
  • Is there anything special about the delimiter you used? Can it be anything? Or does the vertical bar have special meaning in Lucene? – StackOverflowNewbie Nov 21 '11 at 14:50
  • The delimiter must not appear in your author strings, and it should not have any special meaning in Lucene. Choose it carefully, though, because otherwise the analyzer may remove it. You could use a random string such as `sadwqewqadweqg`. – hkn Nov 21 '11 at 14:52
  • Why bother putting in a delimiter? Why not just use a space to separate authors? Something like this would still work `author:Tom Clancy` or `author:"Tom Clancy"` right? – StackOverflowNewbie Nov 21 '11 at 20:59
  • It will work, but you cannot search by exact author. Your first query will match a book whose authors are "Tom Blake" and "Jone Clancy". Your second query will match a book whose author is "Tom Clancy Jones". – hkn Nov 22 '11 at 12:37
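
The false matches described in the last comment can be reproduced with a toy model in Python. This is not how Lucene scores or analyzes text: it assumes a simple lowercase, whitespace-splitting analyzer, and `all_terms_match`, `phrase_match` and `exact_match` are hypothetical stand-ins for the term, phrase and delimited queries.

```python
def tokens(text):
    """Toy analyzer (assumption): lowercase and split on whitespace."""
    return text.lower().split()


def all_terms_match(field_value, query):
    """Like author:(Tom Clancy), but stricter: every term must occur somewhere."""
    field = set(tokens(field_value))
    return all(term in field for term in tokens(query))


def phrase_match(field_value, query):
    """Like author:"Tom Clancy": the terms occur adjacent and in order."""
    field, wanted = tokens(field_value), tokens(query)
    n = len(wanted)
    return any(field[i:i + n] == wanted for i in range(len(field) - n + 1))


def exact_match(field_value, query, delimiter="|"):
    """Like author:"|Tom Clancy|": the whole delimited value must match."""
    return f"{delimiter}{query}{delimiter}".lower() in field_value.lower()


# Space-separated authors: both looser queries give false positives.
print(all_terms_match("Tom Blake Jone Clancy", "Tom Clancy"))  # True (wrong book)
print(phrase_match("Tom Clancy Jones", "Tom Clancy"))          # True (wrong author)

# Delimited authors: only a complete author value matches.
print(exact_match("|Tom Blake|Jone Clancy|", "Tom Clancy"))    # False
print(exact_match("|Tom Clancy Jones|", "Tom Clancy"))         # False
print(exact_match("|Tom Blake|Tom Clancy|", "Tom Clancy"))     # True
```

Even the stricter "every term" check matches the wrong book when authors are only separated by spaces, which is why the delimiter is needed to tell where one author ends and the next begins.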