Given the following 4 documents in an Elasticsearch index:

"hits": [
  {
    "_id": "0:0",
    "_source": {
      "id": 0,
      "version": 0,
      "published": true
    }
  },
  {
    "_id": "0:1",
    "_source": {
      "id": 0,
      "version": 1,
      "published": false,
      "latest": true
    }
  },
  {
    "_id": "1:0",
    "_source": {
      "id": 1,
      "version": 0,
      "published": true
    }
  },
  {
    "_id": "1:1",
    "_source": {
      "id": 1,
      "version": 1,
      "published": true,
      "latest": true
    }
  }
]

I would like to find the documents using these rules:

  • with published:true
  • no duplicate id
  • for documents with the same id the highest version should be returned.

So for the above I'd like to get 0:0 and 1:1:

"hits": [
  {
    "_id": "0:0",
    "_source": {
      "id": 0,
      "version": 0,
      "published": true
    }
  },
  {
    "_id": "1:1",
    "_source": {
      "id": 1,
      "version": 1,
      "published": true,
      "latest": true
    }
  }
]

I'm aware that I can use top_hits, but I'd like to know if this is possible without it, such that the main hits.hits array will contain these results.

I'd probably do the collapsing as follows:

{
  "query": {...},
  "aggs": {
    "ids": {
      "terms": { "field": "id" },
      "aggs": {
        "dedup": {
          "top_hits": { "size": 1, "sort": [{ "version": { "order": "desc" } }] }
        }
      }
    }
  }
}

The reason I'm hoping to avoid top_hits is that I'd need to update the result parser in our application. The top-level size parameter would also stop working as expected, since it limits hits.hits rather than the number of aggregation buckets.
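For reference, the parser change would probably be small. Below is a rough Python sketch, assuming the response has the usual terms + top_hits shape and that the aggregations are named "ids" and "dedup" as in the request above. The function name flatten_top_hits is made up for illustration.

```python
from typing import Any


def flatten_top_hits(
    response: dict[str, Any],
    terms_agg: str = "ids",
    hits_agg: str = "dedup",
) -> list[dict[str, Any]]:
    """Collect the top_hits documents of every terms bucket into one flat list.

    Assumes the response shape
    aggregations.<terms_agg>.buckets[].<hits_agg>.hits.hits[].
    """
    buckets = response.get("aggregations", {}).get(terms_agg, {}).get("buckets", [])
    flat: list[dict[str, Any]] = []
    for bucket in buckets:
        # Each bucket carries its own nested hits.hits array.
        flat.extend(bucket[hits_agg]["hits"]["hits"])
    return flat
```

The returned list has the same element shape as the main hits.hits array, so the rest of the parser could stay as it is. It does not solve the size problem, which would have to move to the terms aggregation.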

1 Answer

To answer my own question, based on this answer: it's not possible without using the top_hits aggregation. I think what I was trying to achieve wasn't a good use of aggregations. Instead I'm going to adjust the index model by adding a latestPublished: true field to the relevant documents, so that the query becomes { term: { latestPublished: true } }.
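As a rough illustration of that re-modelling, here is a Python sketch of how the flag could be computed before indexing. It assumes all versions of a document are available in memory at write time. The helper name mark_latest_published and the sample data layout are hypothetical, and nothing here talks to Elasticsearch.

```python
from typing import Any

# The four _source documents from the question.
SAMPLE_DOCS = [
    {"id": 0, "version": 0, "published": True},
    {"id": 0, "version": 1, "published": False, "latest": True},
    {"id": 1, "version": 0, "published": True},
    {"id": 1, "version": 1, "published": True, "latest": True},
]


def mark_latest_published(docs: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return copies of docs with a boolean latestPublished field.

    For each id, only the published document with the highest version
    gets latestPublished set to True.
    """
    best_version: dict[Any, int] = {}
    for doc in docs:
        if not doc.get("published"):
            continue  # unpublished versions can never be the latest published one
        current = best_version.get(doc["id"])
        if current is None or doc["version"] > current:
            best_version[doc["id"]] = doc["version"]

    return [
        {
            **doc,
            "latestPublished": bool(doc.get("published"))
            and best_version.get(doc["id"]) == doc["version"],
        }
        for doc in docs
    ]
```

For the sample data this flags 0:0 and 1:1. The trade-off is that publishing a new version means re-indexing the previous winner with latestPublished: false as well.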
