10

I am using MongoDB to search for elements that contain a list of lists where at least one item in the list matches the search parameter.

This is an example of the structure I currently have.

{  
   "Item 1":{  
      "data":[  
         ["green", 1]
      ]
   },
   "Item 2":{  
      "data":[  
         ["blue", 9],
         ["green", 1]
      ]
   }
}

I want to search for all items that have the value "green" in the data list.

I currently have this:

my_data.find({'data': {'$in': ['green']}})

however, no results are returned.

leeor
  • 17,041
  • 6
  • 34
  • 60
Gunther
  • 2,474
  • 4
  • 31
  • 45

1 Answers1

11

You need a "direct path" to any queried elements and "data" is not the "top level" of the path that you need to search here, but rather it is "Item 1" or "Item 2" within the document and "data" is a sub-element of that.

The basic case is to use "dot notation" where the root elements are "known" to at least possibly exist, and also note that the $in operator is not necessary just to match a value within an array, since MongoDB does not care. But you will need a nested $elemMatch instead:

db.my_data.find({
    "$or": [
        { "Item 1.data": { 
            "$elemMatch": { 
                "$elemMatch": { "$eq": "green" }
            }
        }},
        { "Item 2.data": {
            "$elemMatch": {
                "$elemMatch": { "$eq": "green" }
            }
        }}
    ]
})

There are no "wildcards" for matching a part of a preceeding path so it is necessary to state the complete path within an $or condition to search each possible path in the document.

If writing out all possible prefixed keys is not practical, then your only other approach is using JavaScript evaluation with $where to determine the logic for matching:

db.my_data.find(function() {
    var doc = this;
    delete doc._id;

    var matched = false;
    for ( k in doc ) {
        matched = doc[k].data.some(function(el) { 
            return el.some(function(inner) {
                return inner === "green";
            });
        });
        if (matched) break;
    }
    return matched;
})

Or some variation of that logic to actually test possible elements with "data" arrays to contain your "green" value. It's not a great solution since $where requires execution on each document in order to evaluate the condition and therefore cannot use an index to filter results. This means scanning the whole collection and is basically the slowest approach.

Also just showing the "shell shortcut" to writing that as otherwise the function() contents is just submitted as a "string" in argument to $where for pymongo or other language drivers, but the basic context is that it is JavaScript code that is evaluated on the server.

Better yet is to change your document so there is always a consistent path to the element you want to query. Such as:

{
    "items": [
        { "name": "Item 1", "data": [["green", 1]] },
        { "name": "Item 2", "data": [["blue", 9], ["green", 1]] }
    ]  
}

Then you can just issue this simple query:

 db.my_data.find({ "items.data": { "$elemMatch": { "$elemMatch": { "$eq": "green" } } } })

As the "path" is always "items.data" this allows one single condtion just for that path to test all elements

Blakes Seven
  • 49,422
  • 14
  • 129
  • 135