I have configured my Elasticsearch implementation to facet the results by an id in the mapping. When I display that facet to the user, I need to show the human-readable name that the id represents. The data I need is all present in the mapping, but I am not sure how to return it as part of the facet. Surely it is possible?

Given the following example, I'd like the facet to give me some way to correlate the thingId to thingName (or any other thing property that might be needed):

Mapping

{
  thingId,
  thingName
}

Facet Query

{
  "facets":{
    "things":{ "terms":{ "field":"thingId" } }
  }
}    

Result

{
  "hits":{
    "total":3,
    "max_score":1.0,
    "hits":[
      ...
    ]
  },
  "facets":{
    "things":{
      "_type":"terms",
      "missing":0,
      "total":3,
      "other":0,
      "terms":[
        {
          "term":"5",
          "count":1
        },
        {
          "term":"4",
          "count":1
        },
        {
          "term":"2",
          "count":1
        }
      ]
    }
  }
}

Edit

This answer regarding Solr suggests that I facet both properties (thingName and thingId) and then just loop over both facet result sets, assuming that the order of items would be the same. I don't know how reliable that would be, but it is an option.

Edit 2

This answer suggests that it's not possible to do what I want without combining the two fields into a single value and faceting on that: thingId|thingName. Not ideal.

Edit 3

This answer suggests combining the values together into a single field and faceting on it (as above), but it uses a terms script to achieve the combination, thus not requiring me to index the combined form of the values. Still not perfect, but appears to be the least crappy option.
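
For the application-layer half of that approach, splitting the combined term back apart could look roughly like the sketch below. It assumes the facet terms come back as `thingId|thingName` strings in the same shape as the terms facet result above; the `split_combined_terms` helper and the sample names are made up for illustration.

```python
def split_combined_terms(terms, sep="|"):
    """Turn combined 'thingId|thingName' facet terms into separate fields."""
    rows = []
    for entry in terms:
        # Split on the first separator only, so a name containing '|' stays whole.
        thing_id, _, thing_name = entry["term"].partition(sep)
        rows.append(
            {"thingId": thing_id, "thingName": thing_name, "count": entry["count"]}
        )
    return rows


# Hypothetical facet terms, shaped like the "terms" array in the result above.
sample_terms = [
    {"term": "5|Widget", "count": 1},
    {"term": "4|Gadget", "count": 1},
]

for row in split_combined_terms(sample_terms):
    print(row["thingId"], row["thingName"], row["count"])
```

Splitting on the first separator only works because the id comes first and (presumably) never contains the separator itself.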

Nathan Taylor
  • Combining `thingId` and `thingName` into a single string and splitting it at the application layer isn't out of the question, but I'm really hoping that there is a better solution. – Nathan Taylor Aug 30 '13 at 22:22
  • I think you need to consider if you really need to use the Id on the facet. Why don't you use the thingName? Have you thought about it? – Bruno Costa Aug 31 '13 at 00:19
  • Using the `thingName` is an option, but it is less ideal because I can't guarantee that `thingName` will be unique. – Nathan Taylor Aug 31 '13 at 00:28
  • Are you saying that you can have AAA with Id 1 and AAA with Id 2? In terms of faceting it should be only one filter, because the user sees the value 'AAA'. With the separation of Id and Name, the user will see two 'AAA' options when refining the search. What do you think? – Bruno Costa Aug 31 '13 at 00:32
  • @BrunoCosta While this would work, it fails to solve a bigger issue that I'd like to address which is that my 'thing' has a bunch of additional properties other than 'name' and 'id' which I may, at some point, need to use. The 'thing' object in question is actually indexed in full, and I'd like to be able to retrieve any part of it without another hit to the db or elastic. The SQL parallel here would be `SELECT Id, Name FROM Thing GROUP BY Id, Name`. – Nathan Taylor Aug 31 '13 at 00:46
  • Why are you using a facet over an ID field? Doesn't that mean you are going to get terms where every count is 1? Why don't you just get back the results themselves? Then you would have the whole object together as a result. Maybe I'm missing something, but facets make no sense to me if the counts are all 1. – ramseykhalaf Aug 31 '13 at 06:43
  • @ramseykhalaf Sorry if it was unclear, it is a relational id to another type. Not the record id. One ThingId to many records. – Nathan Taylor Sep 03 '13 at 17:19

1 Answer

If you're not happy using the terms scripting, then another option would be to use nested aggregations, assuming you're able to use Elasticsearch 1.0.0 or later.

Your aggregation would then look something like this:

{
    "query": {
        "match_all": {}
    },
    "aggs": {
        "theIds": {
            "terms" : {
                "field": "thingId"
            },
            "aggs":{
                "theNames": {
                    "terms": {
                        "field": "thingName"
                    }
                }
            }
        }
    }
}

And the response would be something like:

"aggregations": {
      "theIds": {
         "buckets": [
            {
               "key": "1",
               "doc_count": 5,
               "theNames": {
                  "buckets": [
                     {
                        "key": "AAA",
                        "doc_count": 3
                     },
                     {
                        "key": "BBB",
                        "doc_count": 3
                     },
                     {
                        "key": "CCC",
                        "doc_count": 2
                     }
                  ]
               }
            },
            {
               "key": "2",
               "doc_count": 2,
               "theNames": {
                  "buckets": [
                     {
                        "key": "AAA",
                        "doc_count": 2
                     }
                  ]
               }
            }
         ]
      }
   }
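
On the application side, reading the names back out of that response could look something like the sketch below. It assumes the response has already been parsed into a Python dict and uses the `theIds` / `theNames` aggregation names from the request above; the `names_by_id` helper is just an illustrative name.

```python
def names_by_id(aggregations):
    """Map each thingId bucket key to the thingName keys nested under it."""
    lookup = {}
    for id_bucket in aggregations["theIds"]["buckets"]:
        name_buckets = id_bucket["theNames"]["buckets"]
        lookup[id_bucket["key"]] = {
            "doc_count": id_bucket["doc_count"],
            "names": [bucket["key"] for bucket in name_buckets],
        }
    return lookup


# Sample copied from the response above, already parsed from JSON.
sample = {
    "theIds": {
        "buckets": [
            {
                "key": "1",
                "doc_count": 5,
                "theNames": {
                    "buckets": [
                        {"key": "AAA", "doc_count": 3},
                        {"key": "BBB", "doc_count": 3},
                        {"key": "CCC", "doc_count": 2},
                    ]
                },
            },
            {
                "key": "2",
                "doc_count": 2,
                "theNames": {"buckets": [{"key": "AAA", "doc_count": 2}]},
            },
        ]
    }
}

print(names_by_id(sample))
```

If each thingId only ever maps to one thingName, the `names` list should hold a single entry, which you can then show next to the id's count.
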
Akshay