17

I am trying to perform several insertions into an existing MongoDB collection using the following code:

db.dados_meteo.aggregate( [
                  { $match : { "POM" : "AguiardaBeira" } },
                  { $project : {
                     _id : { $concat: [
                        "0001:",
                      { $substr: [ "$DTM", 0, 4 ] },
                      { $substr: [ "$DTM", 5, 2 ] },
                      { $substr: [ "$DTM", 8, 2 ] },
                      { $substr: [ "$DTM", 11, 2 ] },
                      { $substr: [ "$DTM", 14, 2 ] },
                      { $substr: [ "$DTM", 17, 2 ] }
                       ] },
                    "RNF" : 1, "WET":1,"HMD":1,"TMP":1 } },
                  { $out : "dados_meteo_reloaded" }
              ] )

But each time I change the $match parameters and run a new aggregation, MongoDB deletes the previous documents and inserts only the new result.

Could you help me?

Hugo
  • Possible duplicate of [How to aggregate and merge the result into a collection?](https://stackoverflow.com/questions/20976569/how-to-aggregate-and-merge-the-result-into-a-collection) – NatNgs Jul 05 '17 at 08:56

3 Answers

20

Starting with MongoDB 4.2, the new $merge aggregation stage (similar to $out) merges the result of an aggregation pipeline into the specified collection instead of replacing it.

Given this input:

db.source.insert([
  { "_id": "id_1", "a": 34 },
  { "_id": "id_3", "a": 38 },
  { "_id": "id_4", "a": 54 }
])
db.target.insert([
  { "_id": "id_1", "a": 12 },
  { "_id": "id_2", "a": 54 }
])

the $merge aggregation stage can be used as such:

db.source.aggregate([
  // { $whatever aggregation stage, for this example, we just keep records as is }
  { $merge: { into: "target" } }
])

to produce:

// > db.target.find()
{ "_id" : "id_1", "a" : 34 }
{ "_id" : "id_2", "a" : 54 }
{ "_id" : "id_3", "a" : 38 }
{ "_id" : "id_4", "a" : 54 }

Note that the $merge stage takes options (on, whenMatched, whenNotMatched) that specify how documents produced by the pipeline are matched against, and merged with, documents already in the target collection.

In this case (with the default options), this:

  • keeps the target collection's existing documents (this is the case of { "_id": "id_2", "a": 54 })

  • inserts documents from the output of the aggregation pipeline into the target collection when they are not already present (based on the _id - this is the case of { "_id" : "id_3", "a" : 38 })

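  • updates the target collection's documents when the aggregation pipeline produces documents that already exist in the target collection (based on the _id). The default whenMatched is "merge", which overwrites the matching fields rather than replacing the whole document; here the visible effect is { "_id": "id_1", "a": 12 } becoming { "_id" : "id_1", "a" : 34 }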
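
Applied to the pipeline from the question, the same idea might look like the sketch below. It assumes MongoDB 4.2 or later. The helper name buildStationPipeline and the explicit whenMatched: "replace" choice are illustrative assumptions, not something the question specifies, and the second station name and id prefix in the usage comment are hypothetical.

```javascript
// Hypothetical helper: builds the question's pipeline for one station,
// ending in $merge (instead of $out) so that documents written by earlier
// runs stay in the target collection.
function buildStationPipeline(station, idPrefix) {
  // [start, length] offsets of year, month, day, hour, minute and second
  // inside "$DTM", copied from the question's pipeline.
  const dateParts = [[0, 4], [5, 2], [8, 2], [11, 2], [14, 2], [17, 2]]
    .map(([start, length]) => ({ $substr: ["$DTM", start, length] }));

  return [
    { $match: { POM: station } },
    { $project: {
        _id: { $concat: [idPrefix, ...dateParts] },
        RNF: 1, WET: 1, HMD: 1, TMP: 1
    } },
    { $merge: {
        into: "dados_meteo_reloaded",
        on: "_id",                 // match on _id (the default)
        whenMatched: "replace",    // assumption: a re-run overwrites the old document
        whenNotMatched: "insert"   // new _id values are appended
    } }
  ];
}

// Usage in the shell (the second station and prefix are made up):
// db.dados_meteo.aggregate(buildStationPipeline("AguiardaBeira", "0001:"))
// db.dados_meteo.aggregate(buildStationPipeline("OtherStation", "0002:"))
```

Because each station gets its own _id prefix, the runs should not collide with each other, and re-running one station only touches that station's documents.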

Xavier Guihot
  • 1
    $merge has a limitation too, it will not allow merging into the same collection in which we are performing the aggregation – humble_wolf Feb 22 '20 at 12:07
  • 4
    Starting `Mongo 4.4`, `$merge` can output to the same collection that is being aggregated: https://docs.mongodb.com/master/release-notes/4.4/#merge – Xavier Guihot Feb 22 '20 at 12:46
15

The short answer is "you can't":

If the collection specified by the $out operation already exists, then upon completion of the aggregation, the $out stage atomically replaces the existing collection with the new results collection. The $out operation does not change any indexes that existed on the previous collection. If the aggregation fails, the $out operation makes no changes to the pre-existing collection.

As a workaround, you can copy the documents from the collection specified by $out to a "permanent" collection just after the aggregation, in one of several ways (none of which is ideal, though):

  • copyTo() is the easiest, but mind the warning in its documentation. Use it only for small results.
  • Use JS: db.out.find().forEach(function(doc) {db.target.insert(doc)})
  • Use mongoexport / mongoimport
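
The JS option above inserts one document per call. A batched variant could look like the following sketch; appendInBatches is a hypothetical helper, not a shell built-in, and it relies only on the cursor's hasNext()/next() methods and the collection's insertMany(). It assumes the _id values in the $out collection do not already exist in the target, since a duplicate _id would make the insert fail.

```javascript
// Hypothetical helper: copies every document from a cursor into a target
// collection, sending them in groups of batchSize instead of one by one.
function appendInBatches(cursor, target, batchSize) {
  let batch = [];
  let copied = 0;
  while (cursor.hasNext()) {
    batch.push(cursor.next());
    if (batch.length === batchSize) {
      target.insertMany(batch);
      copied += batch.length;
      batch = [];
    }
  }
  // Flush whatever is left after the last full batch.
  if (batch.length > 0) {
    target.insertMany(batch);
    copied += batch.length;
  }
  return copied; // number of documents handed to insertMany
}

// Usage in the shell, with the collection names from the list above:
// appendInBatches(db.out.find(), db.target, 1000)
```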
Ori Dar
  • 37
    can't believe they didn't have a way to append. and this is a db that has millions in funding? pathetic. – Shai UI Jul 31 '15 at 01:09
  • 2
    Ori, your answer looks good. may be you want to edit it and add this option http://stackoverflow.com/a/37433640/2834978. db.col1.insert(db.col2.aggregate(...).toArray()) – LMC Nov 01 '16 at 14:20
1

It's not the prettiest thing ever, but as another alternative syntax (from a post-processing archive/append operation)...

db.targetCollection.insertMany(db.runCommand(
{
    aggregate: "sourceCollection",
    pipeline: 
    [
        { $skip: 0 },
        { $limit: 5 },
        { 
            $project:
            {
                myObject: "$$ROOT",
                processedDate: { $add: [new ISODate(), 0] }
            }
        }
    ]
}).result)

I'm not sure how this stacks up against the forEach variant, but I find it more intuitive to read.
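
On newer servers (3.6 and later, as far as I know) the raw aggregate command expects a cursor option and returns its documents under cursor.firstBatch rather than result, so the snippet above may need adjusting there. The collection helper with toArray(), as suggested in a comment on another answer, avoids that difference. A minimal sketch, where appendAggregation is a hypothetical wrapper:

```javascript
// Hypothetical wrapper: runs a pipeline on `source` and appends the
// resulting documents to `target`. toArray() loads the whole result into
// memory, so this is only reasonable for modest result sizes.
function appendAggregation(source, target, pipeline) {
  const docs = source.aggregate(pipeline).toArray();
  if (docs.length === 0) {
    return 0; // nothing to insert, so skip the write entirely
  }
  target.insertMany(docs);
  return docs.length;
}

// Usage in the shell, with the collection names from the answer above:
// appendAggregation(db.sourceCollection, db.targetCollection, [
//   { $limit: 5 },
//   { $project: { myObject: "$$ROOT", processedDate: { $add: [new ISODate(), 0] } } }
// ])
```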

Jesse MacNett