
In a SOLR install, when I search against a field with a multi-word search term I want SOLR to return documents that have all of the terms in the search, but they do not need to be in the exact order.

For example, if I search the title field for Brown Chicken Brown Cow, I want to find all documents that contain all of the terms Brown, Chicken and Cow in the title field, irrespective of order. So, for example, the title "The chicken and the cow have brown poop" should match the query. AFAIK, this is how Google executes searches as well.

I have experimented with the following query formats:

1. Title:Brown AND Title:Chicken
2. Title:Brown AND Chicken
3. Title:Brown+Chicken

I am very confused by the results. In some instances, the first two queries return the same exact set of results. In other instances, the first version will return many results and the second version will return none. The third version seems to meet my needs, but I am confused by the different meaning of the queries.

All of my tests have been run against a field of type text_en.

<field name="Title" multiValued="false" type="text_en" indexed="true" stored="true"/>

So, what's the best SOLR query/setup for this type of search? Also, is there an easy way to make Solr.NET take a user-entered search term and convert it to this type of format?

Also, will SOLR by default give documents that match the order of the search phrase a higher relevancy score? If not, what are the right levers to pull to make that happen?

Edit: Some of my confusion was caused by searching against non-default fields versus the default field. Knowing this, the only format that works consistently is the first format.

user229044
jmacinnes

2 Answers


If I were you I would try to use:

Title:(Brown Chicken)

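The parentheses apply the Title field to every term inside them, so this is equivalent to your query no. 1 provided the default operator is AND (for example, q.op=AND). If the default operator is OR, write Title:(Brown AND Chicken) instead. Quotation marks, by contrast, force Solr to search for the exact phrase, including term order.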
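To turn a user-entered search term into this grouped form, something along the following lines might work. This is only a sketch in Python rather than Solr.NET; the function name `build_all_terms_query` is made up for illustration, and it assumes the standard Lucene query parser, whitespace-separated terms, and that repeated words (such as the second "Brown") can be dropped.

```python
import re

# Characters (and the && / || pairs) that have special meaning in the
# Lucene/Solr standard query syntax and need a backslash in front of them.
_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\/]|&&|\|\|)')

# Upper-case words the parser would read as operators rather than terms.
_OPERATORS = {"AND", "OR", "NOT"}


def build_all_terms_query(field, user_input):
    """Return a query that requires every distinct term of user_input in field.

    Hypothetical helper, not part of Solr or Solr.NET.
    """
    seen = set()
    terms = []
    for raw in user_input.split():
        key = raw.lower()
        if key in seen:
            continue  # skip repeated words, compared case-insensitively
        seen.add(key)
        if raw in _OPERATORS:
            raw = key  # lower-case so it is treated as a plain term
        terms.append(_SPECIAL.sub(r"\\\1", raw))
    if not terms:
        return "*:*"  # assumption: empty input falls back to match-all
    return "{}:({})".format(field, " AND ".join(terms))


print(build_all_terms_query("Title", "Brown Chicken Brown Cow"))
```

Writing the AND operators out explicitly means the result does not depend on how the default operator is configured.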

Fuxi

Please try Title:"Brown Chicken", or use the DisMax query parser to handle your queries.
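As a rough sketch of the DisMax route, the request parameters below ask the extended DisMax parser to require every term in Title (`mm=100%`) and to boost documents where the terms also appear together as a phrase (`pf`). `defType`, `qf`, `mm` and `pf` are standard Solr parameters; the helper names and the base URL with the core name `mycore` are assumptions made up for this example.

```python
from urllib.parse import urlencode


def edismax_params(user_input, field="Title"):
    """Build request parameters for an all-terms, any-order search."""
    return {
        "defType": "edismax",  # use the extended DisMax query parser
        "q": user_input,       # raw user text, no field prefixes needed
        "qf": field,           # field(s) the terms are searched in
        "mm": "100%",          # every term must match, in any order
        "pf": field,           # boost documents matching the whole phrase
    }


def select_url(base_url, user_input):
    """Return the full select URL for the given user input."""
    return base_url + "?" + urlencode(edismax_params(user_input))


# Hypothetical host and core name; adjust to your installation.
print(select_url("http://localhost:8983/solr/mycore/select", "Brown Chicken Cow"))
```

With this approach the user's text is passed through as typed, so no query rewriting is needed on the client side.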


The wiki for the Lucene query parser says (emphasis mine):

....Since text is the default field, the field indicator is not required.

Note: The field is only valid for the term that it directly precedes, so the query

title:Do it right

Will only find "Do" in the title field. It will find "it" and "right" in the default field (in this case the text field).
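The rule quoted above can be illustrated with a toy model. This is not the real Lucene parser: it ignores parentheses, quotes, boosts and everything else, and the function name `fields_searched` and the default field name `text` are assumptions for the example. It only shows which field each bare term ends up in.

```python
OPERATORS = {"AND", "OR", "NOT"}


def fields_searched(query, default_field="text"):
    """Pair each term with the field a much-simplified parser would search."""
    pairs = []
    for token in query.split():
        if token in OPERATORS:
            continue  # operators combine clauses; they are not search terms
        field, sep, term = token.partition(":")
        if sep:
            pairs.append((field, term))            # explicit field prefix
        else:
            pairs.append((default_field, token))   # falls back to the default
    return pairs


print(fields_searched("title:Do it right"))
print(fields_searched("Title:Brown AND Chicken"))
print(fields_searched("Title:Brown AND Title:Chicken"))
```

In this model the second query sends Chicken to the default field, so queries 1 and 2 from the question return the same results only when Title is the default field.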

Do you have only the title field in your data model?

Please run your query with debugQuery=on to see how it is parsed and how the matching documents are scored. See it in action: https://stackoverflow.com/a/9262300/604511

Jesvin Jose
  • No, there are multiple fields in my documents. Title is the default field, so that explains some of the weirdness I was seeing. Using quotes doesn't work; it enforces term order. With this information, it seems like the only way to accomplish what I want is this format: Title:Brown AND Title:Chicken. Unfortunately, it is a little trickier to parse a keyword into that form. I'll look into Dismax. – jmacinnes Feb 14 '12 at 13:26
  • "it enforces term order" should have known :-/ – Jesvin Jose Feb 14 '12 at 16:26