I'm using ES v0.90.1. I want to boost documents of a specific type in my index using one of their fields. As described in the official documentation, I defined my mapping like this:

{
    "mappings": {
        "mytesttype": {
            "_boost": {
                "name": "doc_boost",
                "null_value": 1.0
            },
            "properties": {
                "date_start": {
                    "type": "date",
                    "format": "date_time"
                },
                "date_end": {
                    "type": "date",
                    "format": "date_time"
                }
            }
        }
    }
}

So, as I understand it, I'm saying that my index will have a type mytesttype with a document boost field named doc_boost that defaults to 1.

Here's the index's metadata after creation:

{

    state: open
    settings: {
        index.number_of_shards: 1
        index.number_of_replicas: 0
        index.version.created: 900199
    }
    mappings: {
        mytesttype: {
            _boost: {
                null_value: 1
                name: doc_boost
            }
            properties: {
                date_end: {
                    format: date_time
                    type: date
                }
                date_start: {
                    format: date_time
                    type: date
                }
                y: {
                    type: long
                }
                x: {
                    type: long
                }
            }
        }
    }
    aliases: [ ]

}

I then indexed two documents:

{
    "ref": "ref-1",
    "date_start": "2013-07-01T00:00:00.000+0000",
    "date_end": "2016-07-01T00:00:00.000+0000",
    "y": 100,
    "x": 100,
    "doc_boost": 1.0
}

{
    "ref": "ref-2",
    "date_start": "2013-07-01T00:00:00.000+0000",
    "date_end": "2016-07-01T00:00:00.000+0000",
    "y": 100,
    "x": 100,
    "doc_boost": 2.0
}

Those two documents are identical except for the doc_boost and ref values.

Now my goal is a simple request that returns both documents, with the one that has doc_boost = 2 scored higher. Here's my request:

{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "x": {
              "query": 100,
              "type": "boolean"
            }
          }
        },
        {
          "match": {
            "y": {
              "query": 100,
              "type": "boolean"
            }
          }
        },
        {
          "range": {
            "date_start": {
              "from": null,
              "to": "now",
              "include_lower": true,
              "include_upper": true
            }
          }
        },
        {
          "range": {
            "date_end": {
              "from": "now",
              "to": null,
              "include_lower": true,
              "include_upper": true
            }
          }
        }
      ]
    }
  }
}

I would expect a higher score for the ref-2 document, but here's the response I'm getting, together with the explain output:

{

    took: 3
    timed_out: false
    _shards: {
        total: 1
        successful: 1
        failed: 0
    }
    hits: {
        total: 2
        max_score: 2
        hits: [
            {
                _shard: 0
                _node: 99cl3dO9TFecp3fDiR3e6A
                _index: test_elasticsearchtest
                _type: mytesttype
                _id: mkwrfEswSj-T5x0c5AObuw
                _score: 2
                _source: {
                    ref: ref-1
                    date_start: 2013-07-01T00:00:00.000+0000
                    date_end: 2016-07-01T00:00:00.000+0000
                    y: 100
                    x: 100
                    doc_boost: 1
                }
                _explanation: {
                    value: 2
                    description: sum of:
                    details: [
                        {
                            value: 0.5
                            description: ConstantScore(x:[100 TO 100]), product of:
                            details: [
                                {
                                    value: 1
                                    description: boost
                                }
                                {
                                    value: 0.5
                                    description: queryNorm
                                }
                            ]
                        }
                        {
                            value: 0.5
                            description: ConstantScore(y:[100 TO 100]), product of:
                            details: [
                                {
                                    value: 1
                                    description: boost
                                }
                                {
                                    value: 0.5
                                    description: queryNorm
                                }
                            ]
                        }
                        {
                            value: 0.5
                            description: ConstantScore(date_start:[* TO 1374063073249]), product of:
                            details: [
                                {
                                    value: 1
                                    description: boost
                                }
                                {
                                    value: 0.5
                                    description: queryNorm
                                }
                            ]
                        }
                        {
                            value: 0.5
                            description: ConstantScore(date_end:[1374063073249 TO *]), product of:
                            details: [
                                {
                                    value: 1
                                    description: boost
                                }
                                {
                                    value: 0.5
                                    description: queryNorm
                                }
                            ]
                        }
                    ]
                }
            }
            {
                _shard: 0
                _node: 99cl3dO9TFecp3fDiR3e6A
                _index: test_elasticsearchtest
                _type: mytesttype
                _id: uvtIJ3n2RTad6CHnzENHgA
                _score: 2
                _source: {
                    ref: ref-2
                    date_start: 2013-07-01T00:00:00.000+0000
                    date_end: 2016-07-01T00:00:00.000+0000
                    y: 100
                    x: 100
                    doc_boost: 2
                }
                _explanation: {
                    value: 2
                    description: sum of:
                    details: [
                        {
                            value: 0.5
                            description: ConstantScore(x:[100 TO 100]), product of:
                            details: [
                                {
                                    value: 1
                                    description: boost
                                }
                                {
                                    value: 0.5
                                    description: queryNorm
                                }
                            ]
                        }
                        {
                            value: 0.5
                            description: ConstantScore(y:[100 TO 100]), product of:
                            details: [
                                {
                                    value: 1
                                    description: boost
                                }
                                {
                                    value: 0.5
                                    description: queryNorm
                                }
                            ]
                        }
                        {
                            value: 0.5
                            description: ConstantScore(date_start:[* TO 1374063073249]), product of:
                            details: [
                                {
                                    value: 1
                                    description: boost
                                }
                                {
                                    value: 0.5
                                    description: queryNorm
                                }
                            ]
                        }
                        {
                            value: 0.5
                            description: ConstantScore(date_end:[1374063073249 TO *]), product of:
                            details: [
                                {
                                    value: 1
                                    description: boost
                                }
                                {
                                    value: 0.5
                                    description: queryNorm
                                }
                            ]
                        }
                    ]
                }
            }
        ]
    }

}

Both documents have the same score. Could someone explain what I did wrong?

  • Thanks for the detailed question. Would you mind adding the `explain=true` parameter to your search request and posting the result? It should explain the reasoning behind those scores. – javanna Jul 17 '13 at 11:46
  • @javanna I added the result with explain activated. Didn't know this was possible. Thanks for pointing it out! – Crystark Jul 17 '13 at 12:17

1 Answer

The problem here is that you are not executing any full-text search. As you can see from the explain output, all of your queries, including the match queries on the numeric fields x and y, map to range queries, which don't involve any scoring. In fact they just match or not; you can't say how much, can you? That's why you find ConstantScore in the explain output, and that's why the document boost is not taken into account.

Index time boosting (can be either at a document level or per field) is usually taken into account when a score needs to be computed in order to say how much a document matches a certain query. You would see the index time boosting factor in the field norm section of the explain output.

To work around the issue you're having, I would suggest not using index time boosting. It's not flexible, since it requires you to reindex your documents in order to change it. I would rather use query time boosting. There are different queries available in Elasticsearch that allow you to modify the score; have a look at this other question to know more.

You can still rely on the doc_boost field in your documents if you want, which means you'd still have to reindex your docs in order to change that value. You just need to remove the _boost fragment from your mapping, since you are going to apply the boost factor at query time. You can then wrap your query in a custom score query and use a script to modify the score, for instance multiplying it by doc_boost:

"custom_score" : {
    "query" : {
        ....
    },
    "script" : "_score * doc['doc_boost'].value"
}
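
To make the fragment above concrete, here is a sketch of what the complete request body might look like with the bool query from the question placed inside custom_score. It assumes the _boost fragment has been removed from the mapping and that doc_boost is indexed as a regular numeric field on every document; treat it as an illustration rather than a tested request.

```json
{
  "query": {
    "custom_score": {
      "query": {
        "bool": {
          "must": [
            { "match": { "x": { "query": 100, "type": "boolean" } } },
            { "match": { "y": { "query": 100, "type": "boolean" } } },
            { "range": { "date_start": { "from": null, "to": "now", "include_lower": true, "include_upper": true } } },
            { "range": { "date_end": { "from": "now", "to": null, "include_lower": true, "include_upper": true } } }
          ]
        }
      },
      "script": "_score * doc['doc_boost'].value"
    }
  }
}
```

If the inner query still gives both documents the constant score of 2 shown in the explain output, the script would produce 2 × 1 = 2 for ref-1 and 2 × 2 = 4 for ref-2, so ref-2 would rank first.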
  • Thanks for the explanation. I really had no clue you needed a full-text search to have the document boost taken into account. I think I misunderstood what the score was: I thought it was more a rank than an actual score towards a goal (a perfect query match). Makes more sense now. And the custom_score solution works like a charm. I think that's what we'll be using for now, as we want to manage priorities with this. Thanks again. – Crystark Jul 17 '13 at 14:34