1

I have a field in my documents, that is named after its timestamp, like so:

{
    _id: ObjectId("53f2b954b55e91756c81d3a5"),
    domain: "example.com",
    "2014-08-07 01:25:08": {
        A: [
            "123.123.123.123"
        ],
        NS: [
            "ns1.example.com.",
            "ns2.example.com."
        ]
    }
}

This is very impractical for queries, since every document has a different timestamp. Therefore, I want to rename this field, for all documents, to a fixed name. However, I need to be able to match the field names using regex, because they are all different.

I tried doing this, but this is an illegal query.

db['my_collection'].update({}, {$rename:{ /2014.*/ :"201408"}}, false, true);

Does someone have a solution for this problem?

SOLUTION BASED ON NEIL LUNN'S ANSWER:

conn = new Mongo();
db = conn.getDB("my_db");

var bulk = db['my_coll'].initializeOrderedBulkOp();
var counter = 0;

db['my_coll'].find().forEach(function(doc) {

    for (var k in doc) {
            if (k.match(/^2014.*/) ) {
                print("replacing " + k)
                var unset = {};
                unset[k] = 1;
                bulk.find({ "_id": doc._id }).updateOne({ "$unset": unset, "$set": { WK1: doc[k]} });
                counter++;
            }

    }

    if ( counter % 1000 == 0 ) {
        bulk.execute();
        bulk = db['my_coll'].initializeOrderedBulkOp();
    }

});

if ( counter % 1000 != 0 )
    bulk.execute();
Neil Lunn
  • 148,042
  • 36
  • 346
  • 317
Thomas
  • 2,070
  • 3
  • 16
  • 21

2 Answers2

3

This is not a mapReduce operation, not unless you want a new collection that consists only of the _id and value fields that are produced from mapReduce output, much like:

    "_id": ObjectId("53f2b954b55e91756c81d3a5"), 
    "value": { 
        "domain": "example.com",
        ... 
    } 
}

Which at best is a kind of "server side" reworking of your collection, but of course not in the structure you want.

While there are ways to execute all of the code in the server, please don't try to do so unless you are really in a spot. These ways generally don't play well with sharding anyway, which is usually where people "really are in a spot" for the sheer size of records.

When you want to change things and do it in bulk, you generally have to "loop" the collection results and process the updates while having access to the current document information. That is, in the case where your "update" is "based on" information already contained in fields or structure of the document.

There is therefore not "regex replace" operation available, and there certainly is not one for renaming a field. So let's loop with bulk operations for the "safest" form of doing this without running the code all on the server.

var bulk = db.collection.initializeOrderedBulkOp();
var counter = 0;

db.collection.find().forEach(function(doc) {

    for ( var k in doc ) {
        if ( doc[k].match(/^2014.*/) ) {
            var update = {};
            update["$unset"][k] = 1;
            update["$set"][ k.replace(/(\d+)-(\d+)-(\d+).+/,"$1$2$3") ] = doc[k];
            bulk.find({ "_id": doc._id }).updateOne(update);
            counter++;
        }
    }

    if ( counter % 1000 == 0 ) {
        bulk.execute();
        bulk = db.collection.initializeOrderedBulkOp();
    }

});

if ( counter % 1000 != 0 )
    bulk.execute();

So the main things there are the $unset operator to remove the existing field and the $set operator to create the new field in the document. You need the document content to examine and use both the "field name" and "value", so hence the looping as there is no other way.

If you don't have MongoDB 2.6 or greater on the server then the looping concept still remains without the immediate performance benefit. You can look into things like .eval() in order to process on the server, but as the documentation suggests, it really is not recommended. Use with caution if you must.

Neil Lunn
  • 148,042
  • 36
  • 346
  • 317
  • Neil, thanks! This was the solution I used in the end. One note though, I changed the matching to k instead of the object doc[k] (I think this was a small bug) and I adjusted the update string. I added my concrete implementation above. – Thomas Aug 11 '14 at 13:34
0

As you already recognized, value-keys are indeed very bad for the MongoDB query language. So bad that what you want to do doesn't work.

But you could do it with a MapReduce. The map and reduce functions wouldn't do anything, but the finalize function would do the conversion in Javascript.

Or you could write a little program in a programming language of your which reads all documents from the collection, makes the change, and writes them back using collection.save.

Philipp
  • 67,764
  • 9
  • 118
  • 153
  • 2
    Thank you for you answer, Philipp. Could you explain (maybe by example) how this map reduce would look like. As you could've guessed, I'm not really experienced with this. – Thomas Aug 08 '14 at 11:52