27

I need to replace a string in certain documents. I have googled this code, but it unfortunately does not change anything. I am not sure about the syntax on the line marked below:

pulpdb = db.getSisterDB("pulp_database");
var cursor = pulpdb.repos.find();
while (cursor.hasNext()) {
  var x = cursor.next();
  x['source']['url'].replace('aaa', 'bbb'); // is this correct?
  db.foo.update({_id : x._id}, x);
}

I would like to add some debug prints to see what the value is, but I have no experience with MongoDB Shell. I just need to replace this:

{ "source": { "url": "http://aaa/xxx/yyy" } }

with

{ "source": { "url": "http://bbb/xxx/yyy" } }
Xavier Guihot
lzap
  • The Mongo shell runs arbitrary JavaScript, which suggests that your code works. Have you simply tried it? – Derick Apr 06 '12 at 10:56

4 Answers

38

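This is not correct as written: replace returns a new string instead of modifying the original, so the result has to be assigned back to x['source']['url']. The update should also target the collection the documents were read from, not db.foo. Also note that replace with a string pattern changes only the first occurrence: http://aaa/xxx/aaa (yyy equal to aaa) becomes http://bbb/xxx/aaa. If that is fine for you, the code below will work.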
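A quick way to check how String.prototype.replace behaves is to run it on plain strings, in the mongo shell or in any other JavaScript runtime. The sketch below is only an illustration: the helper name escapeRegExp and the example.com host names are assumptions, not part of the question.

```javascript
// replace() returns a new string; the original string is left untouched.
var url = "http://aaa/xxx/aaa";

// With a string pattern, only the first occurrence is replaced.
var firstOnly = url.replace("aaa", "bbb");

// With a regular expression and the g flag, every occurrence is replaced.
var all = url.replace(/aaa/g, "bbb");

// Hypothetical helper: escape regex metacharacters so that a literal such as
// "abc.example.com" can be used safely inside a RegExp.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

var host = "https://abc.example.com/path";
var replaced = host.replace(
  new RegExp(escapeRegExp("abc.example.com"), "g"),
  "def.example.com"
);
```

Escaping matters when the text to find contains characters such as the dot, which would otherwise match any character.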

To add debug info use print function:

var cursor = db.test.find();
while (cursor.hasNext()) {
  var x = cursor.next();
  print("Before: "+x['source']['url']);
  x['source']['url'] = x['source']['url'].replace('aaa', 'bbb');
  print("After: "+x['source']['url']);
  db.test.update({_id : x._id}, x);
}

(By the way, if you want to print out objects, there is also the printjson function.)

om-nom-nom
  • Oh I did not try "print" :-) That simple! Okay, I can see the data are coming in, I guess I have a snag in the regexp (the real case is not xxx but https://abc.blablab.com) – lzap Apr 06 '12 at 11:06
  • Got it - I had to do x['source']['url'] = x['source']['url'].replace(...) instead. – lzap Apr 06 '12 at 11:10
  • Hmm for some strange reason the variable is replaced but the data is not stored then. Do I need to perform a commit or something? I still see old data there. – lzap Apr 06 '12 at 11:13
  • @Izap Have you changed db name in the last line? – om-nom-nom Apr 06 '12 at 11:18
  • Sorry, I am new to this. Is there a way to do this between two different collections? @om-nom-nom – LearningEveryday Dec 30 '15 at 20:33
  • nice answer.. worked perfect.. can I know what language is this syntac.. is this specific to mongoshell? – Sravan Jul 21 '17 at 12:45
  • @Sravan it's JavaScript in the mongo shell :-) mongo does some exports for you, and the usual JavaScript becomes valid – om-nom-nom Jul 21 '17 at 19:57
  • i try this code but i see Updated 0 record(s) after every update call. – Amir Azizkhani Oct 21 '19 at 12:37
4

If you are on MongoDB 2.6 or newer, the best way to do this is to loop over the cursor with the .forEach method and update each document using "bulk" operations for maximum efficiency.

var bulk = db.collection.initializeOrderedBulkOp();
var count = 0;

db.collection.find().forEach(function(doc) {
    print("Before: " + doc.source.url);
    // Queue an update for this document
    bulk.find({ '_id': doc._id }).update({
        '$set': { 'source.url': doc.source.url.replace('aaa', 'bbb') }
    });
    count++;
    // Send the queued operations every 200 documents and start a new batch
    if (count % 200 === 0) {
        bulk.execute();
        bulk = db.collection.initializeOrderedBulkOp();
    }
});

// Send whatever is still queued (skipped when the last batch was already sent)
if (count % 200 !== 0)
    bulk.execute();

From MongoDB 3.2, the Bulk() API and its associated methods are deprecated; use the db.collection.bulkWrite() method instead.

You will need to loop over the cursor, build each operation dynamically, and push it onto an array.

var operations = [];
db.collection.find().forEach(function(doc) {
    print("Before: " + doc.source.url);
    var operation = {
        updateOne: {
            filter: { '_id': doc._id },
            update: {
                '$set': { 'source.url': doc.source.url.replace('aaa', 'bbb') }
            }
        }
    };
    operations.push(operation);
});

// Options are the second argument of bulkWrite, not an entry in the array
db.collection.bulkWrite(operations, {
    ordered: true,
    writeConcern: { w: "majority", wtimeout: 5000 }
});
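For large collections it may be worth sending the operations in batches rather than in one large array, much as the earlier Bulk() example executes every 200 documents. The following is a minimal sketch of that idea, not a definitive implementation: buildOperations and chunk are hypothetical helper names, and the batch size of 200 is simply carried over from the example above.

```javascript
// Build one updateOne operation per document (same shape as above).
function buildOperations(docs, find, replacement) {
  return docs.map(function (doc) {
    return {
      updateOne: {
        filter: { _id: doc._id },
        update: {
          $set: { "source.url": doc.source.url.replace(find, replacement) }
        }
      }
    };
  });
}

// Split an array into consecutive slices of at most `size` elements.
function chunk(items, size) {
  var batches = [];
  for (var i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// In the shell, this could be used roughly as follows:
// var docs = db.collection.find({ "source.url": /aaa/ }).toArray();
// chunk(buildOperations(docs, "aaa", "bbb"), 200).forEach(function (batch) {
//   db.collection.bulkWrite(batch, { ordered: true });
// });
```

Note that toArray() loads all matching documents into memory, so for very large result sets it is probably better to build and send batches while iterating the cursor.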
styvane
2

Nowadays,

  • Starting in Mongo 4.2, db.collection.updateMany (like db.collection.update and db.collection.updateOne) can accept an aggregation pipeline, finally allowing the update of a field based on its own value.
  • Starting in Mongo 4.4, the new aggregation operator $replaceOne makes it very easy to replace part of a string.
// { "source" : { "url" : "http://aaa/xxx/yyy" } }
// { "source" : { "url" : "http://eee/xxx/yyy" } }
db.collection.updateMany(
  { "source.url": { $regex: /aaa/ } },
  [{
    $set: { "source.url": {
      $replaceOne: { input: "$source.url", find: "aaa", replacement: "bbb" }
    }}
  }]
)
// { "source" : { "url" : "http://bbb/xxx/yyy" } }
// { "source" : { "url" : "http://eee/xxx/yyy" } }
  • The first part ({ "source.url": { $regex: /aaa/ } }) is the match query, filtering which documents to update (the ones containing "aaa")
  • The second part ($set: { "source.url": {...) is the update aggregation pipeline (note the square brackets signifying the use of an aggregation pipeline):
    • $set is a new aggregation operator (Mongo 4.2) which in this case replaces the value of a field.
    • The new value is computed with the new $replaceOne operator. Note how source.url is modified directly based on its own value ($source.url).

Note that this is fully handled server side which won't allow you to perform the debug printing part of your question.
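One possible way to recover some of that visibility is to preview the change on the client before running the server-side update. The sketch below is an illustration under stated assumptions: previewReplace is a hypothetical helper, and it mimics $replaceOne with JavaScript's String.replace, which also replaces only the first occurrence when given a string pattern.

```javascript
// Hypothetical helper: for each document whose URL contains `find`, report
// the current URL and what it would become, without writing anything back.
function previewReplace(docs, find, replacement) {
  return docs
    .filter(function (doc) {
      return doc.source.url.indexOf(find) !== -1;
    })
    .map(function (doc) {
      return {
        _id: doc._id,
        before: doc.source.url,
        after: doc.source.url.replace(find, replacement)
      };
    });
}

// In the shell, something like:
// previewReplace(db.collection.find().toArray(), "aaa", "bbb").forEach(printjson);
```

Nothing is modified by the preview, so it can be run repeatedly until the before and after values look right.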

Xavier Guihot
1

MongoDB can do string search/replace via mapreduce. Yes, you need to have a very special data structure for it -- you can't have anything in the top keys but you need to store everything under a subdocument under value. Like this:

{
    "_id" : ObjectId("549dafb0a0d0ca4ed723e37f"),
    "value" : {
            "title" : "Top 'access denied' errors",
            "parent" : "system.admin_reports",
            "p" : "\u0001\u001a%"
    }
}

Once you have this neatly set up you can do:

$map = new \MongoCode("function () {
  this.value['p'] = this.value['p'].replace('$from', '$to');
  emit(this._id, this.value);
}");
$collection = $this->mongoCollection();
// This won't be called.
$reduce = new \MongoCode("function () { }");
$collection_name = $collection->getName();
$collection->db->command([
  'mapreduce' => $collection_name,
  'map' => $map,
  'reduce' => $reduce,
  'out' => ['merge' => $collection_name],
  'query' => $query,
  'sort' => ['_id' => 1],
]);
chx
  • This isn't a correct approach to the problem - mapReduce can produce a new result set, it should not be used to "replace" existing values this way. Plus you are depending on something extremely specific - formatting your collection this way just to output _id,value pairs seems way more complicated than the already given answer to do it by iterating over documents in the shell. – Asya Kamsky Dec 26 '14 at 23:20
  • Not all web applications have privileges to execute shell commands. Another approach would be to retrieve all into PHP, replace and save back but in server surely is faster. Finally, could you quote some official documentation as why it shouldn't be used this way? I haven't read anything saying you shouldn't merge into the source. – chx Dec 27 '14 at 05:24
  • you are neither mapping nor reducing :) Basically, you are overwriting and that's not really the purpose of "mapReduce" - you are literally doing an update of each document. At best, this can be described as a hack (that only works on this exact specific format of the document) – Asya Kamsky Dec 29 '14 at 16:35
  • Yes. this is a hack. Of course it is. And yet. It's a useful hack. – chx Dec 29 '14 at 20:12