
I need to convert all non-date types (ISO strings, unix timestamps...) to js Date objects. So is there a way to modify the value of a field natively in Mongo? I'm thinking something along the lines of $inc - as it modifies the current value in place.

For example: given the following doc structure in the "projects" collection:

{
  _id: '123456',
  created: '1449224127049',
  name: 'My document',
  ...
}

Is there a way to do something like:

db.projects.update(
  {created: {$not: {$type: 9} } },
  {$set: {created: new Date( ??? ) } },
  {multi: true}
);
styvane
carlevans719
  • I'm afraid you cannot update a field with values referring to the same (or any other) field. You will find [questions on SO](http://stackoverflow.com/questions/3974985/update-mongodb-field-using-value-of-another-field) about this issue, and also an open [feature request](https://jira.mongodb.org/browse/SERVER-458). – dgiugg Dec 04 '15 at 10:42
  • this may help http://stackoverflow.com/questions/15473772/how-to-convert-from-string-to-date-data-type – undefined_variable Dec 04 '15 at 11:10

1 Answer


Sadly, the positional $ operator only works on arrays.

However, here is how I'd do it.

Selecting the documents

To get only the documents whose created field is not already a date, we match on the $type of created and exclude BSON type 9 (Date). This can be achieved with

db.projects.find({"created":{ $not:{ $type:9 }}})

We want the highest selectivity here, returning only documents which we really need to process, since the following steps will be rather expensive.
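Before converting anything, it may help to know which kinds of values the selected documents actually contain. The following is a minimal sketch in plain JavaScript; kindOf and tally are hypothetical helper names, not part of MongoDB, and the four categories are an assumption about what the data looks like.

```javascript
// Hypothetical helper: names the kind of value found in `created`.
function kindOf(value) {
  if (value instanceof Date) { return "date"; }
  if (typeof value === "number") { return "number"; }
  if (typeof value === "string") {
    // All-digit strings are most likely timestamps, anything else needs a closer look.
    return /^\d+$/.test(value) ? "digit string" : "other string";
  }
  return typeof value;
}

// Counts how often each kind occurs. `docs` only needs a forEach method.
function tally(docs) {
  var counts = {};
  docs.forEach(function (doc) {
    var kind = kindOf(doc.created);
    counts[kind] = (counts[kind] || 0) + 1;
  });
  return counts;
}
```

Since tally only relies on forEach, it should accept either an array of documents or the cursor returned by the find() above.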

Creating the dates

Next, we need to iterate over the documents returned and set the created field accordingly. However, we have a problem: we can not simply pass a string of milliseconds since epoch to Date(). So you need to implement parsing logic based on the data you have. Let's have a look at an example of the problem.

In the mongo shell

new Date('1449224127049')

returns

ISODate("0NaN-NaN-NaNTNaN:NaN:NaNZ")

The problem here is that we need to implement parsing logic, which is not easy to get right, since parseInt("2015-12-04T11:27:36.806Z") happily returns 2015. This parsing logic can get pretty complex and is a bit out of scope for this answer; you'd need to implement it based on your needs and data.
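To illustrate what such parsing logic might look like, here is a rough sketch in plain JavaScript. toDate is a hypothetical name, and the rules are assumptions that need checking against the real data: all-digit values below 1e11 are read as seconds since epoch, larger ones as milliseconds, and strings are accepted only if they are full ISO 8601 timestamps with a time zone.

```javascript
// Hypothetical helper: returns a Date, or undefined if the value is not understood.
function toDate(value) {
  if (value instanceof Date) {
    return isNaN(value.getTime()) ? undefined : value;
  }
  // Numbers and all-digit strings are treated as timestamps since epoch.
  if (typeof value === "number" ||
      (typeof value === "string" && /^\d+$/.test(value))) {
    var n = Number(value);
    if (!isFinite(n)) { return undefined; }
    // Assumption: values below 1e11 are seconds, everything else milliseconds.
    return new Date(n < 1e11 ? n * 1000 : n);
  }
  // Strict ISO 8601 check, so loosely formatted strings never reach the Date parser.
  var iso = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})$/;
  if (typeof value === "string" && iso.test(value)) {
    var d = new Date(value);
    return isNaN(d.getTime()) ? undefined : d;
  }
  return undefined;
}
```

Returning undefined for anything unrecognized means such documents can be skipped and inspected by hand instead of being overwritten with a wrong date. Note that the seconds heuristic misreads millisecond timestamps from before March 1973.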

Updating the documents

Updating the documents can be achieved pretty easily and with good performance using [bulk operations][mongo:bulk]. Basically, you "queue" operations. The pitfall here would be to write back the complete document, which might interfere with other write operations. So we need to explicitly use $set on the created field in order to minimize the impact of our operation.
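The idea of queueing updates that touch only the created field can also be sketched without a database connection. buildOps and chunk below are hypothetical helpers in plain JavaScript, and parse is whatever parsing function you settle on, assumed to return undefined for values it rejects.

```javascript
// Hypothetical helper: builds one update operation per parsable document.
// Each operation sets only `created`, so other fields are never written back.
function buildOps(docs, parse) {
  var ops = [];
  docs.forEach(function (doc) {
    var date = parse(doc.created);
    if (date === undefined) { return; } // skip values the parser rejects
    ops.push({
      updateOne: {
        filter: { _id: doc._id },
        update: { $set: { created: date } }
      }
    });
  });
  return ops;
}

// Splits a list into batches of at most `size` items.
function chunk(list, size) {
  var batches = [];
  for (var i = 0; i < list.length; i += size) {
    batches.push(list.slice(i, i + size));
  }
  return batches;
}
```

On MongoDB 3.2 or later, each batch produced by chunk(ops, 1000) should be usable as the argument of db.projects.bulkWrite(batch, {ordered: false}), since the operations follow the updateOne document shape that method expects. On older versions, the Bulk API is the way to go.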

Putting it all together

Given your example, here is how you can achieve your goal:

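function parseDate(value){
  // This needs to be HEAVILY expanded
  // Only all-digit values are accepted; they are read as milliseconds since epoch.
  // Anything else (e.g. ISO strings) is skipped instead of being cut short by parseInt.
  if( !/^\d+$/.test(String(value)) ){
    return;
  }
  // ISODate() expects an ISO string, so a numeric timestamp has to go through Date
  return new Date(parseInt(value, 10));
}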

var bulk = db.projects.initializeUnorderedBulkOp();

var docs = db.projects.find({"created":{ $not:{ $type:9 }}});
var updated = 0;
docs.forEach(
  function(doc){
    var date = parseDate(doc.created);
    if ( typeof date === "undefined" ) {
      // Leave values we can not parse untouched
      return;
    }
    bulk.find({"_id":doc._id}).update({$set:{"created":date}});
    updated++;
    // Batch size is 1000 anyway, and a status is nice.
    // Checking only after an update was queued avoids executing an empty bulk.
    if (updated % 1000 == 0){
       print("Updated: " + updated);
       bulk.execute();
       // We can not reuse an executed bulk
       bulk = db.projects.initializeUnorderedBulkOp();
    }
  }
)
// Execute the remainder of documents, if there is any
if (updated % 1000 != 0){
  bulk.execute();
}

Some notes

  • Please, whatever you do on your data: Make a backup before trying anything. Modifying your data the wrong way may easily cause loss of data or precision.

  • I (Markus) will post this as a community wiki answer so that others feel more free to edit my admittedly hacky JS – I am simply not too much of a JS wiz (quite the contrary).

  • Maybe the date parsing can be achieved through a third party library.

Markus W Mahlberg