
How do I have to index my data and configure Solr and its search options so that Google-style autocompletion with the following requirements is possible?

Products:
- We have products with titles, descriptions, and IDs. An example title: toshiba tecra s1: centrino 1.5 ghz/xp pro/15.0" tft/40 gb/256 mb+256mb/cd-rw-dvd-rom/lan/wi-fi
- The products, or fields of the products, have to be indexed so that the following is possible, no matter how the user capitalizes the search term (e.g. TOSHIBA or tOSHiba):
- If a user enters the first three characters "tos", at most 20 results should appear in the autocomplete box, each showing the complete title (phrase), e.g. "toshiba tecra s1: centrino 1.5 ghz/xp pro/15.0" tft/40 gb/256 mb+256mb/cd-rw-dvd-rom/lan/wi-fi".
- If a user enters two terms, e.g. "toshiba tecra", the search result must be more precise: only documents that contain the terms "toshiba tecra" as a coherent phrase should be shown.

It would be great to get any hints on which tokenizer, search component, etc. to use.

I'm using Solr version 3.5.

Thank you for your thoughts, Ramo

ramo

1 Answer


Solr 3.x has a built-in Suggester component, which lets you build suggestions from selected fields.

The following links provide implementation details:
1. http://lucidworks.lucidimagination.com/display/solr/Suggester
2. http://solr.pl/en/2010/11/15/solr-and-autocomplete-part-2/
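
As a rough sketch of how a client might call such a Suggester handler: in Solr 3.x the Suggester is built on the SpellCheckComponent, so requests use the `spellcheck.*` parameters. The base URL and the `/suggest` handler path below are assumptions; they depend on what is registered in your solrconfig.xml.

```python
from urllib.parse import urlencode


def build_suggest_url(prefix, base_url="http://localhost:8983/solr",
                      handler="/suggest", count=20):
    """Build a GET URL for a Suggester-backed request handler.

    base_url and handler are assumptions; adjust them to your solrconfig.xml.
    """
    params = [
        ("spellcheck", "true"),
        ("spellcheck.q", prefix),          # the text the user has typed so far
        ("spellcheck.count", str(count)),  # upper bound on returned suggestions
        ("wt", "json"),
    ]
    return base_url + handler + "?" + urlencode(params)


print(build_suggest_url("tos"))
```

Passing `count=20` matches the "max. 20 results" requirement from the question.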

For alternative approaches, you can look at an EdgeNGram-based implementation or the TermsComponent.
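
To illustrate the EdgeNGram idea, here is a small Python model (not Solr code) of what a front edge n-gram filter produces when the whole lowercased title is kept as one token. The function name and the gram sizes are illustrative assumptions.

```python
def edge_ngrams(text, min_gram=3, max_gram=25):
    """Return front-anchored n-grams of the lowercased text.

    Models an edge n-gram filter applied to a single, unsplit title token.
    """
    token = text.lower()
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


grams = edge_ngrams("Toshiba Tecra S1")
print(grams[:3])  # ['tos', 'tosh', 'toshi']
```

Because every gram starts at the beginning of the title, an input such as "toshiba tecra" matches only titles that begin with that exact phrase, while "tecra" alone does not match this title.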

Jayendra
  • Hi, I thought the Suggester works only on a per-term basis. I tried to implement the approach from the second link, but somehow I didn't get the result I described above. I don't know whether it is possible to search for a coherent term.... – ramo Dec 11 '11 at 15:45
  • It works on terms, but what counts as a term depends on how you index the field. You can use the keyword tokenizer so that titles are not split into tokens, together with lowercase and ASCII folding filters to make the autocompletion case- and language-independent. – Jayendra Dec 11 '11 at 17:43
  • Okay, I've used KeywordTokenizerFactory and LowerCaseFilterFactory for indexing and searching. That now works well. The next step would be to optimize Solr for search performance. Thank you, Jayendra – ramo Dec 11 '11 at 20:15
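
The effect of the keyword tokenizer plus lowercase and ASCII folding filters discussed in the comments can be modeled in a few lines of Python. This is only a sketch of the analysis behavior, not Solr code: the function names are hypothetical, the accent stripping is a simplification of Solr's ASCII folding, and the second and third sample titles are made up for illustration.

```python
import unicodedata


def analyze(text):
    """Mimic keyword tokenizer + lowercase + ASCII folding: one token per title."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    # Drop combining marks so that accented letters fold to their base letter.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def suggest(titles, user_input, limit=20):
    """Return up to `limit` original titles whose analyzed form starts with the input."""
    prefix = analyze(user_input)
    return [t for t in titles if analyze(t).startswith(prefix)][:limit]


# Sample data: the first title is shortened from the question, the rest are invented.
titles = [
    "Toshiba Tecra S1: Centrino 1.5 GHz",
    "Toshiba Satellite Pro",
    "Tecra Carrying Case",
]
print(suggest(titles, "tOS"))
print(suggest(titles, "toshiba tecra"))
```

Since each title stays a single token, the suggestion shown to the user is always the complete title, and a two-word input narrows the list to titles that start with that phrase.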