
Disclaimer: I am very new to ES! Sorry if this is an obvious question!

I have the following example document in a user index:

{
  "user": {
    "id": "ID-1",
    "documents": 3,
    "docs": [
      {"status": "approved"},
      {"status": "approved"},
      {"status": "approved"}
    ]
  }
}

How do I get the users whose number of approved docs matches the value of documents?

Peggy

1 Answer


This post explains why a script cannot iterate over a nested field. It also shows how to work around that with the help of copy_to and include_in_root.

However, because the same status repeats within the array, copy_to and include_in_root will keep each distinct value only once (this is how Elasticsearch exposes keyword values to a script through doc values). For example, if docs has the statuses [approved, approved], the flattened field holds a single approved. If docs has the statuses [approved, disapproved], the flattened field holds 2 elements.
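To make the deduplication concrete, here is a small Python model of what the script ends up seeing. It is only an illustration, assuming keyword doc values behave like a sorted set of distinct values; the helper name flattened_doc_values is made up for this sketch and is not an Elasticsearch API.

```python
def flattened_doc_values(statuses):
    """Model keyword doc values as distinct values in sorted order (assumption)."""
    return sorted(set(statuses))


# Three identical statuses collapse into one element.
print(flattened_doc_values(["approved", "approved", "approved"]))  # ['approved']

# Two different statuses stay as two elements.
print(flattened_doc_values(["approved", "disapproved"]))  # ['approved', 'disapproved']
```

So counting elements of the flattened field cannot tell 1 approved doc apart from 3 approved docs.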

I have a suggestion, but it requires processing the data before indexing. If you want to hear it, I can post it.

Based on the information in the question, I used this mapping:

PUT my-index
{
  "mappings": {
    "properties": {
      "user": {
        "properties": {
          "id": {
            "type": "keyword"
          },
          "documents": {
            "type": "long"
          },
          "docs": {
            "type": "nested",
            "include_in_root": true,
            "properties": {
              "status": {
                "type": "keyword"
              }
            }
          }
        }
      }
    }
  }
}

Note that I added the include_in_root property.

Indexing Data:

POST my-index/_doc
{
  "user": {
    "id": "ID-1",
    "documents": 3,
    "docs": [
      {
        "status": "approved_1"
      },
      {
        "status": "approved_2"
      },
      {
        "status": "approved_3"
      }
    ]
  }
}

Note that the data has to be changed before indexing. I assumed the "documents" field is the count of entries in the "docs" array and that each entry carries its own status. To get around the deduplication problem I mentioned above, each status value must be made unique by appending a suffix to it.

My idea is to append the doc number to each status, for example "approved_1", "approved_2", "disapproved_3".
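A minimal sketch of that preprocessing step in Python, run on the client before the document is sent to Elasticsearch. The function name add_status_suffixes is hypothetical, and the sketch assumes the document has the user.docs[].status shape shown in the question.

```python
import copy


def add_status_suffixes(source):
    """Return a copy of the document with "_<n>" appended to each status."""
    result = copy.deepcopy(source)  # leave the caller's document untouched
    for position, entry in enumerate(result["user"]["docs"], start=1):
        entry["status"] = f'{entry["status"]}_{position}'
    return result


original = {
    "user": {
        "id": "ID-1",
        "documents": 3,
        "docs": [
            {"status": "approved"},
            {"status": "approved"},
            {"status": "approved"},
        ],
    }
}

prepared = add_status_suffixes(original)
print([d["status"] for d in prepared["user"]["docs"]])
# ['approved_1', 'approved_2', 'approved_3']
```

Because the search script only looks at the text before the first underscore, this works as long as the original status values do not contain an underscore themselves.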

This way the script can iterate over the array and compare statuses: if the number of approved entries equals the value of the "documents" property, the document is returned.

GET my-index/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "script": {
            "script": {
              "source": """
                  // Count the statuses whose prefix (the text before '_') is 'approved'.
                  def count = 0;
                  def status = doc['user.docs.status'];
                  for (int i = 0; i < status.size(); i++) {
                    def real_status = status[i].splitOnToken('_')[0];
                    if (real_status == 'approved') {
                      count++;
                    }
                  }
                  // Return a boolean on every path, not only when the counts match.
                  return count == doc['user.documents'].value;
                  """
            }
          }
        }
      ]
    }
  }
}
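If you want to sanity-check the logic without a cluster, the filter can be mirrored in a few lines of Python. This is only a rough model: the function name matches is made up, and it assumes the script sees the distinct status values of the flattened field.

```python
def matches(source):
    """Mirror of the script filter: count 'approved' prefixes and compare."""
    # Model the flattened field as distinct, sorted values (assumption).
    statuses = sorted({entry["status"] for entry in source["user"]["docs"]})
    approved = sum(1 for s in statuses if s.split("_")[0] == "approved")
    return approved == source["user"]["documents"]


def make_user(statuses):
    """Build a test document in the shape used in the question."""
    return {
        "user": {
            "id": "ID-1",
            "documents": len(statuses),
            "docs": [{"status": s} for s in statuses],
        }
    }


print(matches(make_user(["approved_1", "approved_2", "approved_3"])))  # True
print(matches(make_user(["approved_1", "disapproved_2"])))  # False
print(matches(make_user(["approved", "approved", "approved"])))  # False
```

The last call shows why the suffix is needed: without it the three identical statuses collapse into one and the count no longer matches.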
rabbitbr
Thank you for explaining this. I'd appreciate your thoughts on the other solution you mentioned. – Peggy Aug 18 '22 at 01:59