I have documents like this one in collection x in MongoDB:

{
    "_id" : ...
    "attrs" : [
        {
            "key": "A1",
            "type" : "T1",
            "value" : "13"
        },
        {
            "key": "A2",
            "type" : "T2",
            "value" : "14"
        }
    ]
}

The A1 and A2 elements above are just examples: the attrs field may hold any number of array elements.

I need to access the attrs array concurrently from several independent clients connected to MongoDB. For example, consider two clients: one wants to change the value of the element identified by key "A1" to "80", and the other wants to change the value of the element identified by key "A2" to "20". Is there any compact way of doing this using MongoDB operations?

It is important to note that:

  • Clients don't know the position of each element in the attrs array, only the key of the element whose value has to be modified.
  • Reading the whole attrs array in client space, searching for the element to modify there, and then updating attrs with the new array (in which the element has been changed) would involve race conditions.
  • Clients may also add and remove elements in the array. Thus, doing a first search in MongoDB to locate the position of the element to modify and then updating it using that particular position doesn't work in general, as elements could have been added or removed in the meantime, altering the position previously found.
Blakes Seven
fgalan
  • In some sense, this question is a follow-up of http://stackoverflow.com/questions/31632070/query-for-documents-which-have-an-internal-sub-field-of-a-given-value – fgalan Jul 26 '15 at 23:37

1 Answer

The process here is really quite simple; it only varies depending on whether you want to "find or create" the elements in the array.

First, assuming the elements for each key are in place already, then the simple case is to query for the element and update with the index returned via the positional $ operator:

db.collection.update(
   {
       "_id": docId,
       "attrs": { "$elemMatch": { "key": "A1", "type": "T1" } }
   },
   { "$set": { "attrs.$.value": "20" } }
)

That will only modify the element that is matched without affecting others.
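To relate this back to the two clients in the question, here is a minimal sketch of how each client could build its own filter and update documents. The helper name `buildSetValue` and the placeholder `_id` are assumptions made for illustration, and the sketch matches on key only (as the question describes) rather than on key and type:

```javascript
// Hypothetical helper (not part of MongoDB): builds the filter and update
// documents for changing the value of one array element identified by key.
function buildSetValue(docId, key, value) {
  return {
    // $elemMatch selects the array element; the positional $ in the
    // update path then refers to the first element that matched.
    filter: { _id: docId, attrs: { $elemMatch: { key: key } } },
    update: { $set: { "attrs.$.value": value } }
  };
}

// Two independent clients, each touching a different element.
const docId = "doc-1"; // placeholder _id, assumed for illustration
const clientA = buildSetValue(docId, "A1", "80");
const clientB = buildSetValue(docId, "A2", "20");

// In the mongo shell each client would then run something like:
//   db.x.update(clientA.filter, clientA.update)
//   db.x.update(clientB.filter, clientB.update)
console.log(JSON.stringify(clientA));
console.log(JSON.stringify(clientB));
```

Neither client needs to know the position of its element, since the server resolves it while applying the update.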

In the second case where "find or create" is required and the particular key may not exist, then you use "two" update statements. But the Bulk Operations API allows you to do this in a single request to the server with a single response:

var bulk = db.collection.initializeOrderedBulkOp();

// Try to update where exists
bulk.find({
    "_id": docId,
    "attrs": { "$elemMatch": { "key": "A1", "type": "T2" } }
}).updateOne({
    "$set": { "attrs.$.value": "30" }
});

// Try to add where it does not exist
bulk.find({
    "_id": docId,
    "attrs": { "$not": { "$elemMatch": { "key": "A1", "type": "T2" } } }
}).updateOne({
    "$push": { "attrs": { "key": "A1", "type": "T2", "value": "30" } }
});

bulk.execute();

The basic logic being that first the update attempt is made to match an element with the required values just as done before. The other condition tests for where the element is not found at all by reversing the match logic with $not.

In the case where the array element was not found then a new one is valid for addition via $push.

I should really add that since we are specifically looking for negative matches here, it is always a good idea to match the "document" that you intend to update by some unique identifier such as the _id key. While this is also possible with "multi" updates, you need to be careful about what you are doing.
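The same pair of statements can be written as plain operation documents, which makes the pattern easy to reuse for any key. This is only a sketch: `buildFindOrCreateOps` is a hypothetical helper name, the placeholder `_id` is assumed, and the resulting array is shaped for `bulkWrite()`, which is an alternative to the `initializeOrderedBulkOp()` interface used above and is only available in MongoDB 3.2 and later.

```javascript
// Hypothetical helper (not part of MongoDB): builds the two ordered
// "find or create" operations for one array element.
function buildFindOrCreateOps(docId, key, type, value) {
  const match = { key: key, type: type };
  return [
    // 1. Update the value where a matching element already exists.
    {
      updateOne: {
        filter: { _id: docId, attrs: { $elemMatch: match } },
        update: { $set: { "attrs.$.value": value } }
      }
    },
    // 2. Push a new element where no matching element exists.
    {
      updateOne: {
        filter: { _id: docId, attrs: { $not: { $elemMatch: match } } },
        update: { $push: { attrs: Object.assign({}, match, { value: value }) } }
      }
    }
  ];
}

const ops = buildFindOrCreateOps("doc-1", "A1", "T2", "30"); // placeholder _id

// In the mongo shell (3.2+) this could be sent in one request with:
//   db.collection.bulkWrite(ops, { ordered: true })
console.log(JSON.stringify(ops, null, 2));
```

Keeping the operations ordered matters here: the second statement only finds a document to modify when the first one had no element to update.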

So when running the "find or create" process, the element that was not matched is added to the array correctly without interfering with other elements, and the previous update for an expected match is applied in the same way:

{
    "_id" : ObjectId("55b570f339db998cde23369d"),
    "attrs" : [
            {
                    "key" : "A1",
                    "type" : "T1",
                    "value" : "20"
            },
            {
                    "key" : "A2",
                    "type" : "T2",
                    "value" : "14"
            },
            {
                    "key" : "A1",
                    "type" : "T2",
                    "value" : "30"
            }
    ]
}

This is a simple pattern to follow, and of course the Bulk Operations here avoid the overhead of sending and receiving multiple requests to and from the server. All of this happily works without interfering with other elements that may or may not exist.

Aside from that, there are the extra benefits of keeping the data in an array for easy query and analysis as supported by the standard operators without the need to revert to JavaScript server processing in order to traverse the elements.

Blakes Seven
  • Thank you very much for the answer! I'd need some time to analyze in deep but it looks good. – fgalan Aug 05 '15 at 20:16