Sadly, the $ operator works only on arrays.
However, here is how I'd do it.
Selecting the documents
To select only the records whose created field is not already a date, we match on the $type of created and keep the documents where it is not the date type (BSON type 9). This can be achieved with
db.projects.find({"created":{ $not:{ $type:9 }}})
We want the highest selectivity here, returning only documents which we really need to process, since the following steps will be rather expensive.
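To make it clearer what that filter keeps and what it drops, here is a rough in-memory analogue in plain JavaScript. It is only an illustration and not how MongoDB evaluates the query; needsConversion and the sample documents are made up for this sketch. Note that a $not filter also matches documents in which the field is missing entirely.

```javascript
// In-memory analogue of {"created": {$not: {$type: 9}}}; illustration only.
// needsConversion is a hypothetical helper, not part of the mongo shell.
function needsConversion(doc) {
  return !(doc.created instanceof Date);
}

// Made-up sample documents covering the cases we care about.
var sample = [
  { _id: 1, created: new Date(0) },      // already a date: skipped
  { _id: 2, created: "1449224127049" },  // string: selected
  { _id: 3, created: 1449224127049 },    // number: selected
  { _id: 4 }                             // field missing: selected as well
];

var selectedIds = sample
  .filter(needsConversion)
  .map(function (doc) { return doc._id; });
// selectedIds is [2, 3, 4]
```

So if some documents have no created field at all, the update step has to be prepared to skip them.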
Creating the dates
Next, we need to iterate over the documents returned and set the created field accordingly. However, we have a problem, since we cannot simply pass a string of milliseconds since epoch to Date(). So you need to implement parsing logic based on what you have. Let's have a look at an example of the problem.
In the mongo shell
new Date('1449224127049')
returns
ISODate("0NaN-NaN-NaNTNaN:NaN:NaNZ")
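The problem here is that we need to implement parsing logic, which is not easy to get right: parseInt("2015-12-04T11:27:36.806Z") happily returns 2015, because parseInt stops at the first character that is not a digit. So a naive numeric conversion would silently turn an ISO date string into a timestamp a few seconds after epoch. This parsing logic can get pretty complex and is a bit out of scope for this answer; you'd need to implement it based on your needs and data.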
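As a starting point, here is a minimal sketch of a stricter parser. It assumes that the field holds either a Date, a number of milliseconds since epoch, a digits-only string of milliseconds, or a full ISO 8601 timestamp; those assumptions have to be checked against your actual data. The names parseCreated and validOrUndefined are hypothetical.

```javascript
// Returns the date itself, or undefined if it is an Invalid Date.
function validOrUndefined(date) {
  return isNaN(date.getTime()) ? undefined : date;
}

// Hypothetical helper: converts the value shapes assumed above into a Date.
// Returns undefined for anything it does not recognize, so callers can skip it.
function parseCreated(value) {
  if (value instanceof Date) {
    return validOrUndefined(value);
  }
  if (typeof value === "number") {
    // Assumption: numbers are milliseconds since epoch.
    return validOrUndefined(new Date(value));
  }
  if (typeof value !== "string") {
    return undefined;
  }
  var trimmed = value.trim();
  // Digits only: treat as milliseconds since epoch (assumption about the data).
  if (/^\d+$/.test(trimmed)) {
    return validOrUndefined(new Date(Number(trimmed)));
  }
  // Full ISO 8601 timestamp such as 2015-12-04T11:27:36.806Z.
  var iso = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})$/;
  if (iso.test(trimmed)) {
    return validOrUndefined(new Date(trimmed));
  }
  return undefined;
}
```

Because the whole string is matched rather than just its leading digits, "2015-12-04T11:27:36.806Z" is parsed as an ISO timestamp instead of being cut down to 2015. A bare "2015" would still be read as 2015 milliseconds, so if year-only values can occur in your data, they need a rule of their own.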
Updating the documents
Updating the documents can be achieved pretty easily and with good performance using [bulk operations][mongo:bulk]. Basically, you "queue" operations and send them to the server in batches. The pitfall here would be to write back the complete document, which might overwrite changes made by other write operations in the meantime. So we need to explicitly use $set on the created field in order to minimize the impact of our operation.
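To illustrate the "only touch created" idea, here is a sketch that builds the update operations as plain objects and splits them into batches. It uses the operation format of db.collection.bulkWrite(), which is a different API from the Bulk() builder and is only available in shells that ship it (3.2 and later). The names buildSetOps and chunk, the stand-in parser, and the sample documents are hypothetical.

```javascript
// Hypothetical helper: builds one updateOne operation per convertible document.
// docs can be an array or anything else with forEach (e.g. a shell cursor);
// parse is any function returning a Date or undefined.
function buildSetOps(docs, parse) {
  var ops = [];
  docs.forEach(function (doc) {
    var date = parse(doc.created);
    if (date === undefined) {
      return; // nothing we can convert, leave the document alone
    }
    ops.push({
      updateOne: {
        filter: { _id: doc._id },
        update: { $set: { created: date } } // touch only the created field
      }
    });
  });
  return ops;
}

// Splits the operations into batches of at most `size` entries.
function chunk(ops, size) {
  var batches = [];
  for (var i = 0; i < ops.length; i += size) {
    batches.push(ops.slice(i, i + size));
  }
  return batches;
}

// Made-up input and a deliberately naive stand-in parser, for illustration.
var ops = buildSetOps(
  [{ _id: 1, created: "1449224127049" }, { _id: 2, created: "n/a" }],
  function (value) {
    var n = Number(value);
    return isNaN(n) ? undefined : new Date(n);
  }
);
// ops holds a single operation, for the document with _id 1
```

Each batch from chunk(ops, 1000) could then be passed to db.projects.bulkWrite(batch, { ordered: false }). Since every operation carries only a $set on created, all other fields of the document stay as they are on the server.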
Putting it all together
Given your example, here is how you can achieve your goal:
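function parseDate(value){
  // This needs to be HEAVILY expanded
  // Note: parseInt() stops at the first non-digit, so "2015-12-04T..." yields 2015
  var intValue = parseInt(value, 10);
  if( isNaN(intValue) ){
    return;
  }
  // Date() accepts milliseconds since epoch; ISODate() expects an ISO date string
  return new Date(intValue);
}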
var bulk = db.projects.initializeUnorderedBulkOp();
// Select only the documents whose created field is not a date (BSON type 9)
var docs = db.projects.find({"created":{ $not:{ $type:9 }}});
var updated = 0;
docs.forEach(
  function(doc){
    var date = parseDate(doc.created);
    if ( typeof date === "undefined" ) {
      // Could not parse the value, skip this document
      return;
    }
    // Only $set the created field instead of writing back the whole document
    bulk.find({"_id":doc._id}).update({$set:{"created":date}});
    updated++;
    // Batch size is 1000 anyway, and a status is nice.
    if (updated % 1000 == 0){
      print("Updated: " + updated);
      bulk.execute();
      // We can not reuse an executed bulk
      bulk = db.projects.initializeUnorderedBulkOp();
    }
  }
)
// Execute the remainder of documents, if any: executing an empty bulk raises an error
if (updated % 1000 != 0){
  bulk.execute();
}
Some notes
Please, whatever you do on your data: Make a backup before trying anything. Modifying your data the wrong way may easily cause loss of data or precision.
I (Markus) will post this as a community wiki answer so that others feel more free to edit my admittedly hacky JS – I am simply not too much of a JS wiz (quite the contrary).
Maybe the date parsing can be achieved through a third party library.