78

I have a list of documents, each with lat and lon properties (among others).

{ 'lat': 1, 'lon': 2, someotherdata [...] } 
{ 'lat': 4, 'lon': 1, someotherdata [...] }
[...]

I want to modify it so that it looks like this:

{ 'coords': {'lat': 1, 'lon': 2}, someotherdata [...]} 
{ 'coords': {'lat': 4, 'lon': 1}, someotherdata [...]}
[...]

So far I've got this:

db.events.update({}, {$set : {'coords': {'lat': db.events.lat, 'lon': db.events.lon}}}, false, true)

But it treats the db.events.lat and db.events.lon as strings. How can I reference the document's properties?

Cheers.

Pawel Decowski
  • 1,602
  • 2
  • 15
  • 23

6 Answers6

214

Update: If all you have to do is change the structure of a document without changing the values, see gipset's answer for a nice solution.


According to a (now unavailable) comment on the Update documentation page, you cannot reference the current document's properties from within an update().

You'll have to iterate through all the documents and update them like this:

db.events.find().snapshot().forEach(
  function (e) {
    // update document, using its own properties
    e.coords = { lat: e.lat, lon: e.lon };

    // remove old properties
    delete e.lat;
    delete e.lon;

    // save the updated document
    db.events.save(e);
  }
)

Such a function can also be used in a map-reduce job or a server-side db.eval() job, depending on your needs.

Community
  • 1
  • 1
Niels van der Rest
  • 31,664
  • 16
  • 80
  • 86
  • 7
    As per this comment on a similar question, I discovered that failing to snapshot() your find results, will throw the cursor into an infinite loop in some circumstances. [update field with another field's value](http://stackoverflow.com/questions/2606657/how-can-i-update-field-with-another-fields-value#comment15682010_4260176) Try `db.events.find().snapshot().forEach(` `// ..function goes here` `)` – Ashley Raiteri Sep 15 '13 at 08:49
  • @AshleyRaiteri Thanks, I've included the call to `snapshot()` in the answer. Good catch :) – Niels van der Rest Sep 15 '13 at 12:47
  • Why do you have to do `events.save()` inside the `forEach` block? Why not do `events.update()` in the same place? – Merlin -they-them- Oct 07 '16 at 04:24
  • This was very helpfull. I saved them in a new collection `db.event_new.save()` and then renamed them `db.event.renameCollection("event_old")` and `db.event_new.renameCollection("event")`, so that it gets replaced, but if any error ocurres it dosen't affect the original data. – AxelWass Oct 18 '16 at 17:56
  • First want to reiterate merlinpatt's comment: why not use an update instead? Does it matter? Secondly, I ran into problems trying to use snapshot. I am new to mongo and don't want to lead people astray, but from what I can tell, in Mongo 4.0 snapshot is gone and now you must use hint. If this is wrong please let me know: https://docs.mongodb.com/manual/core/read-isolation-consistency-recency/#cursor-snapshot – Albert Rothman Aug 07 '18 at 21:48
  • 3
    @AlbertRothman, For mongo 4.0 use `.hint({_id: 1})` instead of `.snapshot()`. – twksos Aug 17 '18 at 19:45
  • Beautiful answer. What is the difference in performance between a query in a `updateMany` and this? – Eduardo Pignatelli Sep 27 '18 at 15:47
  • DEPRECATED since Mongo 3.6 http://mongodb.github.io/mongo-java-driver/3.8/javadoc/deprecated-list.html – Elisabeth Shevtsova Dec 26 '18 at 16:23
57

The $rename operator (introduced a month after this question was posted) makes it really easy to do these kinds of things where you don't need to modify the values.

Insert some test documents

db.events.insert({ 'lat': 1, 'lon': 2, someotherdata: [] })
db.events.insert({ 'lat': 4, 'lon': 1, someotherdata: [] })

use the $rename operator

db.events.update({}, {$rename: {'lat': 'coords.lat', 'lon': 'coords.lon'}}, false, true)

Results

db.events.find()
{
    "_id" : ObjectId("5113c82dd28c4e8b79971add"),
    "coords" : {
        "lat" : 1,
        "lon" : 2
    },
    "someotherdata" : [ ]
}
{
    "_id" : ObjectId("5113c82ed28c4e8b79971ade"),
    "coords" : {
        "lat" : 4,
        "lon" : 1
    },
    "someotherdata" : [ ]
}
gipset
  • 752
  • 6
  • 6
6

Neils answer. Just to let people know you cannot run this on a large database if you are lets say doing it remote shell like Robomongo. You will need to ssh into your actual server's mongo shell. Also you could also do this if you would rather do an Update.

db.Collection.find({***/ possible query /***}).toArray().forEach(
  function(obj){
    obj.item = obj.copiedItem;
    obj.otherItem = obj.copiedItem;
    obj.thirdItem = true;
    obj.fourthItem = "string";
    db.Collection.update({_id: obj._id}, obj);
  }
);
Community
  • 1
  • 1
mjwrazor
  • 1,866
  • 2
  • 26
  • 42
  • Is this just a comment on the answer posted by Niels van der Rest? – Vince Bowdren Oct 18 '16 at 21:52
  • It is a comment but also wanted to post a more descriptive answer than you can leave in the add comments section. I also wanted to post the code that I used based on his answer so if people were to lazy to go look up those links than here is a working model. Of course only works on the server. or in a remote shell if you only need to update <101 documents. Some people might not see "save" as the best operation to use. – mjwrazor Oct 19 '16 at 01:36
3

As long as you are OK with creating a copy of the data, the aggregation framework can be used as an alternative here. You also have the option to do more to the data if you wish using other operators, but the only one you need is $project. It's somewhat wasteful in terms of space, but may be faster and more appropriate for some uses. To illustrate, I'll first insert some sample data into the foo collection:

db.foo.insert({ 'lat': 1, 'lon': 2, someotherdata : [1, 2, 3] })
db.foo.insert({ 'lat': 4, 'lon': 1, someotherdata : [4, 5, 6] })

Now, we just use $project to rework the lat and lon fields, then send them to the newfoo collection:

db.foo.aggregate([
    {$project : {_id : "$_id", "coords.lat" : "$lat", "coords.lon" : "$lon", "someotherdata" : "$someotherdata" }},
    { $out : "newfoo" }
])

Then check newfoo for our altered data:

db.newfoo.find()
{ "_id" : ObjectId("544548a71b5cf91c4893eb9a"), "someotherdata" : [ 1, 2, 3 ], "coords" : { "lat" : 1, "lon" : 2 } }
{ "_id" : ObjectId("544548a81b5cf91c4893eb9b"), "someotherdata" : [ 4, 5, 6 ], "coords" : { "lat" : 4, "lon" : 1 } }

Once you are happy with the new data, you can then use the renameCollection() command to drop the old data and use the new data under the old name:

> db.newfoo.renameCollection("foo", true)
{ "ok" : 1 }
> db.foo.find()
{ "_id" : ObjectId("544548a71b5cf91c4893eb9a"), "someotherdata" : [ 1, 2, 3 ], "coords" : { "lat" : 1, "lon" : 2 } }
{ "_id" : ObjectId("544548a81b5cf91c4893eb9b"), "someotherdata" : [ 4, 5, 6 ], "coords" : { "lat" : 4, "lon" : 1 } }

One last note - until SERVER-7944 is completed you can't do the equivalent of a snapshot by hinting the _id index as suggested in this answer and so you can end up hitting a document more than once if activity elsewhere causes it to move. Since you are inserting the _id field in this example, any such occurrence would cause a unique key violation, so you will not end up with dupes, but you might have an "old" version of a document. As always, check your data thoroughly before dropping it, and preferably take a backup.

Community
  • 1
  • 1
Adam Comerford
  • 21,336
  • 4
  • 65
  • 85
3

We can use Mongo script to manipulate data on the fly. It works for me!

I use this script to correct my address data.

Example of current address: "No.12, FIFTH AVENUE,".

I want to remove the last redundant comma, the expected new address ""No.12, FIFTH AVENUE".

var cursor = db.myCollection.find().limit(100);

while (cursor.hasNext()) {
  var currentDocument = cursor.next();

  var address = currentDocument['address'];
  var lastPosition = address.length - 1;

  var lastChar = address.charAt(lastPosition);

  if (lastChar == ",") {

    var newAddress = address.slice(0, lastPosition);


    currentDocument['address'] = newAddress;

    db.localbizs.update({_id: currentDocument._id}, currentDocument);

  }
}

Hope this helps!

Dac Nguyen
  • 77
  • 3
0

From the CLI? I think you have to pull the values out first and assign the value into a variable. Then run your update command.

Or (I haven't tried) remove 'db' from the string. events.lat and events.lon If it works, you will still have multiple values, the old values for "lat" and "lon" and the new array you created.

sdot257
  • 10,046
  • 26
  • 88
  • 122