I am doing some queries on an index products
I have in elasticsearch. I have the following documents in index products
{ "product_name": "prod-1", "meta": [ { "key": "key1", "value": "value1" }, { "key": "key2", "value": "value2" } ] }
{ "product_name": "prod-2", "meta": [ { "key": "key1", "value": "value1" } ] }
{ "product_name": "prod-2", "meta": [ { "key": "key2", "value": "value2" } ] }
{ "product_name": "prod-3", "meta": [ { "key": "key2", "value": "value2" } ] }
What I want now is to get product_name
s that have both key1/value1
and key2/value2
in meta array but not necessarily in the same document. For example, in the above data prod-1
has both key1/value1
and key2/value2
in the same document, so I want prod-1
in the result. And prod-2
also has both key1/value1
and key2/value2
but in different documents. I want prod-2
also in the result. prod-3
has only key2/value2
even across documents. So, I don't want to get prod-3
in the results.
I am trying the following apporach
- Group by product name
- Then filter the aggregation results to check each product has both
key1/value1
andkey2/value2
I group them by product_name
and combine meta fields in each bucket as follows
{
"size": 0,
"aggs": {
"by_product": {
"terms": {
"field": "product_name"
},
"aggs": {
"all_meta": {
"top_hits": {
"_source": {
"includes": [
"meta.key",
"meta.value"
]
}
}
}
}
}
}
}
The result after the above aggregation is effectively the following
"aggregations" : {
"by_product" : {
...
"buckets" : [
{
...
"key" : "prod-2",
"all_meta" : {
"hits" : {
...
"hits" : [
{
....
"_source" : {
"meta" : [
{
"value" : "value1",
"key" : "key1"
}
]
}
},
{
....
"_source" : {
"meta" : [
{
"value" : "value2",
"key" : "key2"
}
]
}
}
]
}
}
},
{
....
"key" : "prod-1",
"all_meta" : {
"hits" : {
....
"hits" : [
{
....
"_source" : {
"meta" : [
{
"value" : "value1",
"key" : "key1"
},
{
"value" : "value2",
"key" : "key2"
}
]
}
}
]
}
}
},
{
....
"key" : "prod-3",
"all_meta" : {
"hits" : {
....
"hits" : [
{
....
"_source" : {
"meta" : [
{
"value" : "value2",
"key" : "key2"
}
]
}
}
]
}
}
}
]
}
}
Now, I want to filter values from the above aggrgation and get buckets only when each bucket has both { "key": "key1", "value": "value1" }
and { "key": "key2", "value": "value2" }
in meta. Something like this
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "buckets.all_meta.hits.hits._source.meta",
"query": {
"bool": {
"must": [
{
"match": {
"buckets.all_meta.hits.hits._source.meta.key": "key1"
}
},
{
"match": {
"buckets.all_meta.hits.hits._source.meta.value": "value1"
}
}
]
}
}
}
},
{
"nested": {
"path": "buckets.all_meta.hits.hits._source.meta",
"query": {
"bool": {
"must": [
{
"match": {
"buckets.all_meta.hits.hits._source.meta.key": "key2"
}
},
{
"match": {
"buckets.all_meta.hits.hits._source.meta.value": "value2"
}
}
]
}
}
}
}
]
}
}
}
But I'm not sure how to do the above step. Is it possible to do this? This stackoverflow question is similar but it doesn't have any answers. Is there any other approach to get the results I want? Any help would be appreciated. Thanks.