
In Solr I can use the `query` function query to return a numerical score for a query, and I can use that in the context of a bf parameter, something like bf=product(query('cat'),query('dog')), to multiply two relevance scores together.

Elasticsearch has a search API that is generally more flexible to work with, but I can't figure out how to accomplish the same feat. I can use _score in a script_score function of a function_score query, but I can only use the _score of the main query. How can I incorporate the score of another query? How can I multiply the scores together?

Doug T.
JnBrymn
  • Even better, you can name those queries and do anything you want with them in the context of Solr's query DSL, such as `catQuery={!edismax qf=title^10 text v=$q}`, then refer to that query in a function query: `product($catQuery...)`. Disappointed Elasticsearch lacks this fairly powerful capability. – Doug T. Jul 31 '15 at 22:00

2 Answers


You could script a TF*IDF scoring function using a function_score query. Something like this (ignoring Lucene's query and length normalization):

"script": "tf = _index[field][term].tf(); idf = (1 + log ( _index.numDocs() / (_index[field][term].df() + 1))); return sqrt(tf) * pow(idf,2)"

You'd take the product of those function results for 'cat' and 'dog' and add it to your original query score.

Here's the full query gist.
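
For readers without the gist, here is a minimal sketch of how such a request body might be assembled, written in Python purely to build and print the JSON. It assumes the Elasticsearch 1.x `function_score` syntax with Groovy scripting enabled; the field name `text`, the main query, and the helper names `tfidf_function` and `build_query` are hypothetical and not taken from the gist. `score_mode: multiply` takes the product of the per-term function results and `boost_mode: sum` adds that product to the main query score.

```python
import json

# Groovy script from the answer above; `field` and `term` arrive as script params.
TFIDF_SCRIPT = (
    "tf = _index[field][term].tf(); "
    "idf = (1 + log(_index.numDocs() / (_index[field][term].df() + 1))); "
    "return sqrt(tf) * pow(idf, 2)"
)


def tfidf_function(field, term):
    """One function_score entry scoring a single term (hypothetical helper)."""
    return {
        "script_score": {
            "script": TFIDF_SCRIPT,
            "params": {"field": field, "term": term},
        }
    }


def build_query(main_query, field, terms):
    """Wrap main_query so the final score is query score + product of term scores."""
    return {
        "query": {
            "function_score": {
                "query": main_query,
                "functions": [tfidf_function(field, t) for t in terms],
                "score_mode": "multiply",  # product of the function results
                "boost_mode": "sum",       # added to the main query score
            }
        }
    }


# Assumed example: main query and field name are placeholders.
body = build_query({"match": {"text": "pets"}}, "text", ["cat", "dog"])
print(json.dumps(body, indent=2))
```

The printed JSON could then be sent to the `_search` endpoint; newer Elasticsearch versions changed the scripting syntax, so the script would likely need adapting there.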

Peter Dixon-Moses
  • And there is a bevy of other index-level information you can use to test out and construct your own scoring script directly in the DSL: https://www.elastic.co/guide/en/elasticsearch/reference/1.6/modules-advanced-scripting.html – Peter Dixon-Moses Aug 03 '15 at 15:33
  • Pretty awesome answer... but I would hope to avoid this type of low-level detail, especially if I'm thinking about a multiterm or multifield query. The `function_score`'s `function`s support a `filter` section. Ideally we could also just have a `query` section. – JnBrymn Aug 03 '15 at 20:14
  • Or even simpler, add a `score_mode` argument to the `should` block of a `Bool Query`. (As of now, ES scores boolean queries like this: `must` clause score + sum-of(`should` clause scores)). What you're trying to do is `must` clause score + prod-of(`should` clause scores). – Peter Dixon-Moses Aug 06 '15 at 16:01
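
The difference between the two combinations described in the last comment can be illustrated with plain arithmetic. This is only a sketch with made-up clause scores; it ignores coord and normalization factors, and the function names are hypothetical.

```python
import math


def bool_score_sum(must_score, should_scores):
    """Current behavior as described: must score + sum of should scores."""
    return must_score + sum(should_scores)


def bool_score_product(must_score, should_scores):
    """Desired behavior: must score + product of should scores."""
    return must_score + math.prod(should_scores)


# Made-up scores for a main query and the 'cat' and 'dog' clauses.
must_score = 1.5
should_scores = [2.0, 3.0]

print(bool_score_sum(must_score, should_scores))      # 6.5
print(bool_score_product(must_score, should_scores))  # 7.5
```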

Alternatively, if something in that bf is heavyweight enough that you'd rather not run it across the entire set of matches, you could use rescore requests to modify the scores of the top N results of the original query, using subsequent scoring passes with your scoring queries (cat, dog, etc.).
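
A rough sketch of what such a request body might look like, again built in Python just to print the JSON. The field name `text`, the main query, the window size of 50, and the `rescorer` helper are assumptions for illustration. Each rescore pass uses `score_mode: multiply`, so within the window a document's score is multiplied by the score of the 'cat' query and then by the score of the 'dog' query.

```python
import json


def rescorer(query, window_size=50):
    """One rescore pass over the top `window_size` hits (hypothetical helper)."""
    return {
        "window_size": window_size,
        "query": {
            "rescore_query": query,
            "query_weight": 1.0,
            "rescore_query_weight": 1.0,
            "score_mode": "multiply",  # combine by multiplying the two scores
        },
    }


# Assumed example: the main query and field name are placeholders.
body = {
    "query": {"match": {"text": "pets"}},
    "rescore": [
        rescorer({"match": {"text": "cat"}}),
        rescorer({"match": {"text": "dog"}}),
    ],
}
print(json.dumps(body, indent=2))
```

Only the top-ranked window is rescored, so documents outside it keep their original score and ordering relative to each other.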

Peter Dixon-Moses
  • why did you answer the same question twice? If you wish to present an alternative solution you can add it into the same answer – eliasah Aug 03 '15 at 15:47
  • I like having two different answers presented as two different answers so that the upvotes can differentiate between them. – JnBrymn Aug 06 '15 at 17:04