1

Assume a collection where each document contains an array my_array consisting of multiple dictionaries. Each dictionary in the array contains itself an entry with key my_key.

How can I efficiently update all documents in the collection, creating a new field my_list per document which shall contain all my_key values of the dictionaries in the array?

Update with an example document:

{
'_id': '123',
'irrelevant_field_1': '1',
'irrelevant_field_2': '2',
'my_array': [
            {'my_key': 'a', 'other_key_1': 'b'},
            {'my_key': 'aa', 'other_key_2': 'bb'},
            {'my_key': 'aaa', 'other_key_3': 'bbb'}
]
}

I would like to update with a new field 'my_list': ['a', 'aa', 'aaa'] at the same level as my_array.

AlexGuevara
  • 932
  • 11
  • 28
  • Some visibility would go a long way to help solve this problem; please add a sample document and the expected outcome from the above write operation. – chridam Jul 26 '19 at 07:27
  • 1
    @chridam I provided an example, I hope it is clear now. – AlexGuevara Jul 26 '19 at 07:37
  • 1
    You could use an `$out` pipeline to update in an aggregation framework where the full pipeline looks like `db.collection.aggregate( [ { $addFields : { my_list : "$my_array.my_key" } }, { $out : "collection" } ] )` just like in this possible duplicate https://stackoverflow.com/questions/3974985/update-mongodb-field-using-value-of-another-field – chridam Jul 26 '19 at 07:56
  • 1
    @chridam this works, what I didn't know about is that I can use simply "$my_array.my_key". Is it possible to obtain the same outcome also with an UpdateMany operation instead of an aggregation? If you provide your comment as an answer I will gladly accept it. I would also be interested if UpdateMany would be quicker. thank you! – AlexGuevara Jul 26 '19 at 12:17

1 Answers1

1

With MongoDB 4.2 and above, you can do an update with the aggregation pipeline as follows:

db.collection.updateMany(
    {},
    [
        { $set: { my_list : "$my_array.my_key" } }
    ]
)

For the other versions however you would need to do two queries i.e. one to get the desired field through an aggregation operation and the other query to update the documents:

const docs = db.collection.aggregate([ { $project:  { my_list : "$my_array.my_key" } } ])


const ops = []
docs.forEach(({ my_list }) => { 
    ops.push( { 
        'updateMany': {
            'filter': {},
            'update': { '$set': { my_list } }
        }
    })

    if ( ops.length === 500 ) {
        db.collection.bulkWrite(ops);
        ops = []
    }
})

if ( ops.length > 0 ) {
    db.collection.bulkWrite(ops)
}
chridam
  • 100,957
  • 23
  • 236
  • 235