2

I'm trying to implement versioning (based on storing the diffs between documents) in MongoDB as described in this post: Ways to implement data versioning in MongoDB.

I'm using the same approach so far but want to make queries like: "Give me all the changes between Date1 and Date2". The schema presented in the post uses just a plain incremental numbering as the key for the changes which doesn't allow this kind of querying:

{
    _id : "id of address book record",
    changes : { 
                1234567 : { "city" : "Omaha", "state" : "Nebraska" },
                1234568 : { "city" : "Kansas City", "state" : "Missouri" }
    }
}

My only guess is that to make time based queries I would have to include some time information in the key value. Perhaps an ObjectId or ISODate like so:

{
    _id : "id of address book record",
    changes : { 
                ISODate(Date) : { "city" : "Omaha", "state" : "Nebraska" },
                ObjectId(id) : { "city" : "Kansas City", "state" : "Missouri" }
    }
}

I googled quite a while for a solution but couldn't find anything really useful. If someone has an idea on how to do this, it would be great if you can help me out. It might even be that this is the completely wrong way to approach the problem and that there is a better way to solve this issue ... I am open to any suggestions!

Thanks for helping me out...

Community
  • 1
  • 1
evermean
  • 1,255
  • 21
  • 49

1 Answers1

3

Using a IsoDate (or simple Date) for versioning is a good approach. However, in order to search for it efficiently, you shouldn't use it as a field identifier, but instead make the changes an array and put the date into the sub-documents in this array:

{
    _id : "id of address book record",
    changes : [ 
                { "change_date": ISODate(Date), "city" : "Omaha", "state" : "Nebraska" },
                { "change_date": ISODate(Date), "city" : "Kansas City", "state" : "Missouri" }
    ]
}

This allows you to use the changes.change_date field for queries with $gte or $lte. You can also find the most recent one with the $max aggregation grouping operator (not to be confused with the $max query operator, which does something completely different).

You might also want to add an index to changes.change_date to speed up these queries.

Philipp
  • 67,764
  • 9
  • 118
  • 153