I am aware of several approaches to versioning individual documents in a MongoDB server. However, I'd like to add both versioning of individual documents and versioning of entire sets. That is, if document A changes from version 1 to 2, I'd also like to know which versions documents B, C, D, etc. are at when A is at version 2.

The underlying goal here is data provenance. Say I have a query Q on a set of documents D, generating result R. I'd like to save D+Q=R, so when D becomes D' because a document has changed, I have D'+Q=R'.

Any thoughts on the best strategy for doing this in MongoDB? I could keep separate documents that record the versions of all other documents, but that seems very expensive to run queries against. I could use timestamps rather than versions, which may work okay. Is there support for this kind of thing in MongoDB that I am not aware of?

Thanks.

Overclocked

1 Answer

There's no specific support for a feature like this in MongoDB; you'll have to implement it in your application code. It sounds like you have a denormalization pattern: R is effectively a cache of Q's output that you must invalidate when D changes. You can do this invalidation synchronously, whenever you update any document in D. Alternatively, you can set up a task queue (e.g. RabbitMQ), insert a task into it whenever you change D, and have a background process pull tasks from the queue and update R. Which implementation you choose depends on how consistent you need R to be.
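To make the two options concrete, here is a minimal in-memory sketch rather than a real MongoDB implementation. Everything in it is an assumption for illustration: plain dicts stand in for the collections holding D and R, `queue.Queue` stands in for a broker such as RabbitMQ, and the names `run_query`, `refresh_result`, `update_sync`, `update_async`, and `worker` are hypothetical. It also stores the version of every source document next to R, which is one way to get the set-level provenance the question asks about.

```python
import queue
import threading

# Hypothetical in-memory stand-ins: `docs` plays the role of the set D,
# `results` plays the role of the cached result R.
docs = {"A": {"version": 1, "value": 10}, "B": {"version": 1, "value": 5}}
results = {}
tasks = queue.Queue()  # stand-in for a task queue such as RabbitMQ


def run_query(snapshot):
    """Hypothetical query Q: sum the `value` field over all documents."""
    return sum(doc["value"] for doc in snapshot.values())


def refresh_result():
    """Recompute R and record which version of each document produced it."""
    snapshot = {doc_id: dict(doc) for doc_id, doc in docs.items()}
    results["Q"] = {
        "value": run_query(snapshot),
        "source_versions": {k: v["version"] for k, v in snapshot.items()},
    }


def update_sync(doc_id, value):
    """Synchronous invalidation: R is refreshed before the update returns."""
    docs[doc_id]["value"] = value
    docs[doc_id]["version"] += 1
    refresh_result()


def update_async(doc_id, value):
    """Asynchronous invalidation: enqueue a task; R stays stale until the worker runs."""
    docs[doc_id]["value"] = value
    docs[doc_id]["version"] += 1
    tasks.put(doc_id)


def worker():
    """Background process: pull tasks from the queue and refresh R for each one."""
    while True:
        task = tasks.get()
        try:
            if task is None:  # sentinel value: stop the worker
                break
            refresh_result()
        finally:
            tasks.task_done()
```

With the synchronous path, R never lags behind D, but every write pays the cost of re-running Q. With the queued path, writes stay cheap and R is only eventually consistent. In a real deployment the dicts would be collections, and the sketch ignores concurrent writers entirely.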

A. Jesse Jiryu Davis