
For a document indexed in Elasticsearch like:

{
  "a": [
    {
      "b": [1,2,3],
      "c": "abcd1"
    },
    {
      "b": [4,5,6,7],
      "c": "abcd2"
    }
  ]
}

Can we apply source filtering so that the query returns only the b fields from all objects in a?

I've tried something like this:

{
  "_source": {
    "excludes": [
      "a[*].c"
    ]
  },
  "query": {
    "match_all": {}
  }
}

But it didn't work.

Benjamin W.
  • Check this out: you can select only the b field. https://stackoverflow.com/questions/9605292/make-elasticsearch-only-return-certain-fields – Eirini Graonidou Mar 16 '18 at 20:07

1 Answer


Since "a" is an array of objects, you need to define "a" as a nested datatype to accomplish what you want. Please read the "Array of Objects" note here: https://www.elastic.co/guide/en/elasticsearch/reference/current/array.html

So the "a" property has to be defined as a nested type in the mapping. The steps below follow your example:

1.- Define the mapping

curl -XPUT 'localhost:9200/my_index?pretty' -H 'Content-Type: application/json' -d'
{
  "mappings": {
    "_doc": {
      "properties": {
        "a": {
          "type": "nested" 
        }
      }
    }
  }
}
'
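
The `_doc` level in the mapping above matches Elasticsearch 6.x. As an assumption about your setup: on 7.x and later, where mapping types were removed, the request body for the same `PUT my_index` call would look roughly like this sketch, with `properties` directly under `mappings`:

```json
{
  "mappings": {
    "properties": {
      "a": {
        "type": "nested"
      }
    }
  }
}
```

The rest of the steps should not need to change for this; only the mapping body loses the type name.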

2.- Create document 1 with your sample data:

curl -XPUT 'localhost:9200/my_index/_doc/1?pretty' -H 'Content-Type: application/json' -d'
{
  "a" : [
    {
      "b" : [1,2,3],
      "c" : "abcd1"
    },
    {
      "b" : [4,5,6,7],
      "c" :  "abcd2"
    }
  ]
}
'

3.- Here is how your query should look. Note nested.path, which specifies the path where the nested query starts, followed by the normal query. The _source filter uses the dot path a.b:

curl -XGET 'localhost:9200/my_index/_search?pretty' -H 'Content-Type: application/json' -d'
{
  "_source": "a.b",
  "query": {
    "nested": {
      "path": "a",
      "query": {
        "match_all": {}
      }
    }
  }
}
'
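
For comparison, the same filter can presumably be written in the excludes form the question attempted. Source filter paths use dot notation (wildcards are allowed) and have no array-index syntax, so `a[*].c` becomes `a.c`. A sketch of the request body for the same `_search` call, assuming the index and mapping from the steps above:

```json
{
  "_source": {
    "excludes": ["a.c"]
  },
  "query": {
    "nested": {
      "path": "a",
      "query": {
        "match_all": {}
      }
    }
  }
}
```

The includes form (`"_source": "a.b"`) is the safer choice if objects in a may later gain other fields, since excludes would let those through.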

And this is the result, with only the b field in each object:

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "my_index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "a" : [
            {
              "b" : [1, 2, 3]
            },
            {
              "b" : [4, 5, 6, 7]
            }
          ]
        }
      }
    ]
  }
}

Here is the Elasticsearch reference for the nested data type: https://www.elastic.co/guide/en/elasticsearch/reference/current/nested.html

Averias
  • If `a[...]` is the response from an aggregation query (i.e. an avg aggregation inside a terms aggregation) and I want only b (excluding c), how do I apply source filtering? – MD TAREQ HASSAN Sep 25 '18 at 07:56