1
db.projects.findOne({"_id": "5CmYdmu2Aanva3ZAy"},
{
  "responses": {
    "$elemMatch": {
      "match.nlu": {
        "$elemMatch": {
          "intent": "intent1",
          "$and": [
            {
              "$or": [
                {
                  "entities.entity": "entity1",
                  "entities.value": "value1"
                },
                {
                  "entities.entity": "entity1",
                  "entities.value": {
                    "$exists": false
                  }
                }
              ]
            }
          ],
          "entities.1": {
            "$exists": false
          }
        }
      }
    }
  }
})

In a given project I need a projection containing only one response, hence $elemMatch. Ideally, look for an exact match:

{
    "entities.entity": "entity1",
    "entities.value": "value1"
}

But if such a match doesn't exist, look for a record where entities.value does not exist

The query above doesn't work because if it finds an item with entities.value not set it will return it. How can I get this fallback logic in a Mongo query

Here is an example of document

{
    "_id": "5CmYdmu2Aanva3ZAy",
    "responses": [  
      {
        "match": {
          "nlu": [
            {
              "entities": [],
              "intent": "intent1"
            }
          ]
        },
        "key": "utter_intent1_p3vE6O_XsT"
      },
      {
        "match": {
          "nlu": [
            {
              "entities": [{
                  "entity": "entity1",
                  "value": "value1"
                }],
              "intent": "intent1"
            }
          ]
        },
        "key": "utter_intent1_p3vE6O_XsT"
      },
      {
        "match": {
          "nlu": [
            {
              "intent": "intent2",
              "entities": []
            },
            {
              "intent": "intent1",
              "entities": [
                {
                  "entity": "entity1"
                }
              ]
            }
          ]
        },
        "key": "utter_intent2_Laag5aDZv2"
      }
    ]
}
Neil Lunn
  • 148,042
  • 36
  • 346
  • 317
znat
  • 13,144
  • 17
  • 71
  • 106
  • Show what an actual document looks like and what you expect to receive in response. If you have different cases, then show different documents and the different expected responses. – Neil Lunn Mar 07 '19 at 23:34
  • @NeilLunn absolutely. Question updated – znat Mar 08 '19 at 00:39
  • Is there something in the provided answer that you believe does not address your question? If so then please comment on the answer to clarify what exactly needs to be addressed that has not. If it does in fact answer the question you asked then please note to [Accept your Answers](https://meta.stackexchange.com/questions/5234/how-does-accepting-an-answer-work) to the questions you ask – Neil Lunn Mar 11 '19 at 22:10
  • Hi @NeilLunn, thank you so much for your answer. I took a couple of days offs, hence the delay – znat Mar 13 '19 at 00:09

1 Answers1

1

To answer the question, the first thing to start with is that doing what you want is not as simple as an $elemMatch projection and requires special projection logic of the aggregation framework. The second main principle here is "nesting arrays is a really bad idea", and this is exactly why:

db.collection.aggregate([
  { "$match": {  "_id": "5CmYdmu2Aanva3ZAy"  } },
  { "$addFields": {
    "responses": {
      "$filter": {
        "input": {
          "$map": {
            "input": "$responses",
            "in": {
              "match": {
                "nlu": {
                  "$filter": {
                    "input": {
                      "$map": {
                        "input": "$$this.match.nlu",
                        "in": {
                          "entities": {
                            "$let": {
                              "vars": {
                                "entities": {
                                  "$filter": {
                                    "input": "$$this.entities",
                                    "cond": {
                                      "$and": [
                                        { "$eq": [ "$$this.entity", "entity1" ] },
                                        { "$or": [
                                          { "$eq": [ "$$this.value", "value1" ] },
                                          { "$ifNull": [ "$$this.value", false ] }
                                        ]}
                                      ]
                                    }
                                  }
                                }
                              },
                              "in": {
                                "$cond": {
                                  "if": { "$gt": [{ "$size": "$$entities" }, 1] },
                                  "then": {
                                    "$slice": [
                                      { "$filter": {
                                        "input": "$$entities",
                                        "cond": { "$eq": [ "$$this.value", "value1" ] }
                                      }},
                                      0
                                    ]
                                  },
                                  "else": "$$entities"
                                }
                              }
                            }
                          },
                          "intent": "$$this.intent"
                        }
                      }
                    },
                    "cond": { "$ne": [ "$$this.entities", [] ] }
                  }
                }
              },
              "key": "$$this.key"
            }
          }
        },
        "cond": { "$ne": [ "$$this.match.nlu", [] ] }
      }
    }
  }}
])

Will return:

{
  "_id" : "5CmYdmu2Aanva3ZAy",
  "responses" : [
    {
      "match" : {
        "nlu" : [
          {
            "entities" : [
              {
                      "entity" : "entity1",
                      "value" : "value1"
              }
            ],
            "intent" : "intent1"
          }
        ]
      },
      "key" : "utter_intent1_p3vE6O_XsT"
    }
  ]
}

That is extracting ( as best I can determine your specification ), the first matching element from the nested inner array of entities where the conditions for both entity and value are met OR where the value property does not exist.

Note the additional fallback in that if both conditions meant returning multiple array elements, then only the first match where the value was present and matching would be the result returned.

Querying deeply nested arrays requires chained usage of $map and $filter in order to traverse those array contents and return only items which match the conditions. You cannot specify these conditions in an $elemMatch projection, nor has it even been possible until recent releases of MongoDB to even atomically update such structures without overwriting significant parts of the document or introducing problems with update concurrency.

More detailed explanation of this is on my existing answer to Updating a Nested Array with MongoDB and from the query side on Find in Double Nested Array MongoDB.

Note that both responses there show usage of $elemMatch as a "query" operator, which is really only about "document selection" ( therefore does not apply to an _id match condition ) and cannot be used in concert with the former "projection" variant nor the positional $ projection operator.

You would be advised then to "not nest arrays" and instead take the option of "flatter" data structures as those answers already discuss at length.

Neil Lunn
  • 148,042
  • 36
  • 346
  • 317
  • I figured that I would have to use the aggregation framework. I realized that I could simplify it a bit by assuming that responses with more entity values sets are preferred. Using a reduce and a group steps, sorting by count and taking the first value works withou - unwind responses and responses.match.nlu - match entities and entities values (this returns all matching entities, with or without value) - project->reduce: a new field containing an array with all entities values - sort and group (with first) to select the best response. It ends up being readable . – znat Mar 13 '19 at 00:23
  • However I am wondering about the performance impact of unwinding at the beginning. Thank you so much for your detailed answer and recommendations – znat Mar 13 '19 at 00:24
  • @znat The performance is generally terrible with `$unwind` and that basically scales to get worse over sharded clusters. The basic concept here is nothing that you actually asked for needs to "group" on something "inside" the array content, nor across "multiple documents" and is indeed a "per document" operation. In that case, even though this "looks" more complicated than you think it need be, it really is not. Anyhow the ***real lesson*** here "should" be **do not nest arrays**. I know you "think" this is how you do relational structure, but it really isn't. I made the same mistakes. – Neil Lunn Mar 13 '19 at 06:02
  • I realize the limitations of nested arrays. But I have to live with them for a while until we refactor our data model – znat Mar 13 '19 at 12:35