11

To explain, look at the object below as it is being changed:

obj = {'a': 1, 'b': 2} // Version 1
obj['a'] = 2 // Version 2
obj['c'] = 3 // Version 3

I want to be able to retrieve any of these versions of the object, e.g. get obj as of version 2. I don't want to store a copy of the entire object every time I update a single key.

How can I achieve this functionality?

The actual object I'm trying to do this with has about 500,000 keys, which is why I don't want to store an entire copy of it with every update. My preferred languages for a solution are Python or JavaScript, but I'll take anything.

trincot
josneville
  • Where do you want to store that and how do you want to retrieve it? When you say "version control", my first thought is git. It won't care whether it is json or some other format. And it will save only the differences. Then again, when you say 500 000 name:value pairs, I would say a database sounds like a good idea. – zvone Nov 08 '16 at 22:54
  • Please note that the objects you speak of are not JSON. What you are defining and changing in that code example is a JavaScript object, not JSON. NB: Stack Overflow is not intended for requesting suggestions on libraries (see http://stackoverflow.com/help/on-topic). – trincot Nov 09 '16 at 22:22
  • @zvone, I currently have a database system that keeps logs and version controls data decently well, but I'm finding it extremely slow. I was looking for an algorithm in code that could version control json objects and then just store the json object as a whole in a database. – josneville Nov 09 '16 at 22:31
  • @trincot, I didn't know that requesting suggestions for libraries was off-topic. Thank you for pointing that out. – josneville Nov 09 '16 at 22:32

3 Answers

16

You could use ES6 proxies for that. These would trap any read/write operation on your object and log each change in a change log that can be used for rolling changes back and forward.

Below is a basic implementation, which might need more features if you intend to apply anything other than basic update operations to your object. It lets you get the current version number and move the object back (or forward) to a specific version. Whenever you make a change to the object, it is first moved to its latest version.

This snippet shows some operations, like changing a string property, adding to an array, and shifting it, while moving back and forward to other versions.

Edit: It now also has the capability to get the change log as an object, and to apply that change log to the initial object. This way you can save the JSON of both the initial object and the change log, and replay the changes to get the final object.

function VersionControlled(obj, changeLog = []) {
    var targets = [], version = 0, savedLength, 
        hash = new Map([[obj, []]]),
        handler = {
            get: function(target, property) {
                var x = target[property];
                if (Object(x) !== x) return x;
                hash.set(x, hash.get(target).concat(property));
                return new Proxy(x, handler);
            },
            set: update,
            deleteProperty: update
        };

    function gotoVersion(newVersion) {
        newVersion = Math.max(0, Math.min(changeLog.length, newVersion));
        var chg, target, path, property,
            val = newVersion > version ? 'newValue' : 'oldValue';
        while (version !== newVersion) {
            if (version > newVersion) version--;
            chg = changeLog[version];
            path = chg.path.slice();
            property = path.pop();
            target = targets[version] || 
                     (targets[version] = path.reduce ( (o, p) => o[p], obj ));
            if (chg.hasOwnProperty(val)) {
                target[property] = chg[val];
            } else {
                delete target[property];
            }
            if (version < newVersion) version++;
        }
        return true;
    }
    
    function gotoLastVersion() {
        return gotoVersion(changeLog.length);
    }
    
    function update(target, property, value) {
        gotoLastVersion(); // only last version can be modified
        var change = {path: hash.get(target).concat([property])};
        if (arguments.length > 2) change.newValue = value;
        // Some care concerning the length property of arrays:
        if (Array.isArray(target) && +property >= target.length) {
            savedLength = target.length;
        }
        if (property in target) {
            if (property === 'length' && savedLength !== undefined) {
                change.oldValue = savedLength;
                savedLength = undefined;
            } else {
                change.oldValue = target[property];
            }
        }
        changeLog.push(change);
        targets.push(target);
        return gotoLastVersion();
    }
    
    this.data = new Proxy(obj, handler);
    this.getVersion = _ => version;
    this.gotoVersion = gotoVersion;
    this.gotoLastVersion = gotoLastVersion;
    this.getChangeLog = _ => changeLog;
    // apply change log
    gotoLastVersion();
}

// sample data
var obj = { list: [1, { p: 'hello' }, 3] };

// Get versioning object for it
var vc = new VersionControlled(obj);
obj = vc.data; // we don't need the original anymore, this one looks the same

// Demo of actions:
console.log(`v${vc.getVersion()} ${JSON.stringify(obj)}. Change text:`);
obj.list[1].p = 'bye';
console.log(`v${vc.getVersion()} ${JSON.stringify(obj)}. Bookmark & add property:`);
var bookmark = vc.getVersion();
obj.list[1].q = ['added'];
console.log(`v${vc.getVersion()} ${JSON.stringify(obj)}. Push on list, then shift:`);
obj.list.push(4); // changes both the index '3' property and length => 2 version increments
obj.list.shift(); // several changes and a deletion
console.log(`v${vc.getVersion()} ${JSON.stringify(obj)}. Go to bookmark:`);
vc.gotoVersion(bookmark);

console.log(`v${vc.getVersion()} ${JSON.stringify(obj)}. Go to last version:`);
vc.gotoLastVersion();
console.log(`v${vc.getVersion()} ${JSON.stringify(obj)}. Get change log:`);
var changeLog = vc.getChangeLog();
for (var chg of changeLog) {
    console.log(JSON.stringify(chg));
}

console.log('Restart from scratch, and apply the change log:');
obj = { list: [1, { p: 'hello' }, 3] };
vc = new VersionControlled(obj, changeLog);
obj = vc.data;
console.log(`v${vc.getVersion()} ${JSON.stringify(obj)}`);
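
As a side note on the push comment in the demo: the small sketch below is separate from the implementation above and only illustrates why a single push shows up as two changes. It uses a hypothetical `seen` array to record which properties reach a `set` trap. The array method first writes the new index and then writes `length`, so a trap-based change log gets one entry for each.

```javascript
// Hypothetical demo, independent of VersionControlled: record which
// properties the `set` trap receives during a single push().
const seen = [];
const list = new Proxy([1, 2, 3], {
    set(target, property, value) {
        seen.push(property);       // property keys arrive as strings
        target[property] = value;  // perform the actual write on the array
        return true;               // tell the engine the write succeeded
    }
});

list.push(4);
console.log(seen); // [ '3', 'length' ]
```

Methods such as `shift` or `splice` go through the same trap several times, which is why they produce several version increments in a log built this way.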
trincot
  • So it's basically a repository? – Nina Scholz Nov 09 '16 at 07:14
  • It is like a game: you can play moves, but also take them back, and then move forward again to the last recorded move. – trincot Nov 09 '16 at 09:31
  • @trincot, I'm not that familiar with ES6 syntax. I'm still trying to understand the codebase and will get back to you. Thank you for answering in such detail. – josneville Nov 09 '16 at 22:34
  • @trincot, this is working great. Thank you very much. – josneville Nov 10 '16 at 00:10
  • There is a hoisting bug in the `update` function: the variable `savedLength` should be in the `update` function scope. Otherwise, it will sometimes set the value to `undefined`. I made the changes here https://gist.github.com/hieunc229/ca6aeb507de42a933bafe31e0a174dd0 – Hieu Nguyen Dec 08 '20 at 05:30
  • That is not a bug, @Hieu. It is intended like that. When it is set, that state must be preserved across two consecutive calls of `update`, the first that sets the `length`, the second that sets the array-element value. Otherwise the `gotoVersion` function, when reverting to an older version, will not correctly set the array length back to what it was. It is also intended that `savedLength` will become `undefined`. – trincot Dec 08 '20 at 06:36
  • @trincot Ah, that's correct (my problem was that I pre-assigned `savedLength = 0`, which caused weird behavior). Besides, I found that `savedLength` only needs to be assigned once. Otherwise, when adding multiple values to an array (i.e. `arrayValue.splice(lastPosition, 0, ...multiple values)`), each new value will update `savedLength` to the new length – Hieu Nguyen Dec 08 '20 at 10:10
  • Also that repeated update is necessary in my code, because that `splice` does not result in a *single* version update: each insertion of an array value triggers a separate version, and so the individual updates to `savedLength` are really needed in my answer. However, I wrote this 4 years ago, so I am not so familiar with it anymore. – trincot Dec 08 '20 at 10:14
  • Absolutely, I realized that! – Hieu Nguyen Dec 08 '20 at 10:18
0

You don't need to save the whole object, just the differences for each version.

The snippet below does a deep comparison of the top-level keys using lodash and collects the differences between the old object (obj1) and the new object (obj2).

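// Requires lodash, loaded as `_`; obj1 is the old object, obj2 the new one.
var allkeys = _.union(_.keys(obj1), _.keys(obj2));
var difference = _.reduce(allkeys, function (result, key) {
  // _.isEqual takes the two values as separate arguments and compares them deeply
  if (!_.isEqual(obj1[key], obj2[key])) {
    result[key] = {obj1: obj1[key], obj2: obj2[key]};
  }
  return result;
}, {});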

You will need to keep the very first object, but you can keep the versions this way, I think.
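
To make the "keep the first object plus the differences" idea concrete, here is a minimal sketch that rebuilds any version from the first object and a list of per-version differences. It makes several assumptions that are not part of the answer above: the objects are flat with primitive values (as in the question's example), plain JavaScript is used instead of lodash, and the helper names `diff` and `getVersion` as well as the `{set, removed}` diff shape are made up for illustration.

```javascript
// Hypothetical helpers; assumes flat objects whose values are primitives.
const has = (o, k) => Object.prototype.hasOwnProperty.call(o, k);

// Describe how to get from oldObj to newObj: keys to set and keys to remove.
function diff(oldObj, newObj) {
    const set = {}, removed = [];
    for (const key of Object.keys(newObj)) {
        if (!has(oldObj, key) || oldObj[key] !== newObj[key]) set[key] = newObj[key];
    }
    for (const key of Object.keys(oldObj)) {
        if (!has(newObj, key)) removed.push(key);
    }
    return { set, removed };
}

// Rebuild version n (1-based); version 1 is the first object itself.
function getVersion(base, diffs, n) {
    const result = { ...base };                // one copy of the first object
    for (const d of diffs.slice(0, n - 1)) {   // replay the first n - 1 diffs
        Object.assign(result, d.set);
        for (const key of d.removed) delete result[key];
    }
    return result;
}

// The three versions from the question:
const v1 = { a: 1, b: 2 };
const v2 = { a: 2, b: 2 };
const v3 = { a: 2, b: 2, c: 3 };
const diffs = [diff(v1, v2), diff(v2, v3)];

console.log(getVersion(v1, diffs, 2)); // { a: 2, b: 2 }
```

Each stored diff only holds the keys that changed, so its size depends on the update rather than on the number of keys in the object. The trade-off is that retrieving a late version replays every earlier diff.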

Valentin Roudge
  • Thanks for this solution. My plan is to include this functionality into the solution above to allow versioning for batch updates to keys. – josneville Nov 10 '16 at 00:11
-1

Use the Map object instead:

const obj = new Map( [
    ['name', 'bob'],['number',2]
])
const v2 = ( new Map( obj ) ).set('name','suzie')
const v3 = ( new Map( v2 ) ).set('value',3)

// obj is 'name'->'bob', 'number'->2
// v2 is  'name'->'suzie', 'number'->2
// v3 is  'name'->'suzie', 'number'->2, 'value'->3

The new Map( obj ) is a shallow clone: its entries are copied into a new Map, but the values themselves are shared by reference and the original Map isn't mutated, so it can be used for versioning, saving a list of actions to be undone, etc.
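
A small sketch of what such a clone shares and what it does not, using hypothetical sample data (the `settings` object is an assumption added for illustration). Primitive values end up independent per version, while an object stored as a value is the same object in every version. Note also that `new Map(otherMap)` iterates over all existing entries, so with a very large Map each new version still copies the full list of entries.

```javascript
// Hypothetical sample data: one primitive value and one nested object.
const settings = { theme: 'dark' };
const v1 = new Map([['name', 'bob'], ['settings', settings]]);
const v2 = new Map(v1).set('name', 'suzie');

console.log(v1.get('name'));                            // bob
console.log(v2.get('name'));                            // suzie

// The nested object is not copied; both versions point at the same one.
console.log(v1.get('settings') === v2.get('settings')); // true

// Mutating it through one version is therefore visible in the other.
v2.get('settings').theme = 'light';
console.log(v1.get('settings').theme);                  // light
```

If versions must not affect each other, nested values would have to be replaced with new objects rather than mutated in place.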

Gavin