5

Much ink has been spilled on the subject of testing two objects for deep equality in JavaScript. None, however, seem to care about distinguishing the following two objects:

var o1 = [{},{}];
var subitem = {};
var o2 = [subitem, subitem];
var o3 = [{}, {}];

Most deep equality algorithms would say that o1, o2 and o3 are equal. I want an algorithm that says that o1 and o2 are not equal, but o1 and o3 are equal. To put it differently, I want an algorithm that tells me if the pointer graphs have the same structure or not. I care about this because if I have an a modification of the first element is reflected in the second in o2, but not in o1.

This means that deep equality of cyclic structures should work:

var o1 = [];
o1.push(o1);
var o2 = [];
o2.push(o2);
// deepGraphEqual(o1, o2) == true
var o3 = [[]];
o3[0].push(o3);
// deepGraphEqual(o1, o3) == false

If you're going to avoid mutating the items, you are probably going to need ECMAScript6 maps, so I'll accept solutions that use those.

Edward Z. Yang
  • 26,325
  • 16
  • 80
  • 110
  • 1
    *"Most deep equality algorithms would say that o1 and o2 are equal"* - [citation needed](https://xkcd.com/285/) – nnnnnn Sep 26 '15 at 04:02
  • So you want to check if both the arrays refers to the same object i.e. you want to check the reference is equal and not the values? – Tushar Sep 26 '15 at 04:04
  • Just use `==`. `o1 == o2 // false` exactly as you want – zerkms Sep 26 '15 at 04:06
  • Hey guys, I clarified the question, there was a misunderstanding. I don't want to test if o1 and o2 point to the same thing, but I want to test if the "graphs" their pointer structure represent have the same structure – Edward Z. Yang Sep 26 '15 at 04:08
  • What makes `o2` different from `o3`? – zerkms Sep 26 '15 at 04:08
  • In `o2`, if I modify the first element, the second element reflects the change. In `o3`, if I modify the first element, the second does not. (Equivalently: you can witness differences in sharing by mutating the objects in the structure.) – Edward Z. Yang Sep 26 '15 at 04:09
  • Now convert your explanation into algorithm, then code it. `if there are several elements that refer to the same object in memory - the whole array is not equal to anything else`. I think it's JS 101, isn't it? – zerkms Sep 26 '15 at 04:10
  • I created an npm module for exactly that: http://npmjs.com/package/deep-equal-ident (because I do care :P ). It provides three different implementations. – Felix Kling Sep 26 '15 at 21:49
  • @FelixKling: Cool, I missed that somehow. The stack implementation has a bug, though; [have a patch](https://github.com/fkling/deep-equal-ident/pull/2). – Anders Kaseorg Sep 27 '15 at 02:13
  • @FelixKling: Oh, but I found another bug in all three of your implementations that looks harder to fix without changes to `lodash`: [`a = [[]]; deepEqualIdent([a, a[0]], [a, []])` incorrectly returns true](https://github.com/fkling/deep-equal-ident/issues/3). – Anders Kaseorg Sep 27 '15 at 02:31

2 Answers2

6

Version with no ES6 features that runs in quadratic time:

function deepGraphEqual(a, b) {
    var left = [], right = [], has = Object.prototype.hasOwnProperty;
    function visit(a, b) {
        var i, k;
        if (typeof a !== 'object' || typeof b !== 'object' || a === null || b === null)
            return a === b;
        if (Object.getPrototypeOf(a) !== Object.getPrototypeOf(b))
            return false;
        for (i = 0; i < left.length; i++) {
            if (a === left[i])
                return b === right[i];
            if (b === right[i])
                return a === left[i];
        }
        for (k in a)
            if (has.call(a, k) && !has.call(b, k))
                return false;
        for (k in b)
            if (has.call(b, k) && !has.call(a, k))
                return false;
        left.push(a);
        right.push(b);
        for (k in a)
            if (has.call(a, k) && !visit(a[k], b[k]))
                return false;
        return true;
    }
    return visit(a, b);
}

Version with ES6 Map that runs in linear time:

function deepGraphEqual(a, b) {
    let left = new Map(), right = new Map(), has = Object.prototype.hasOwnProperty;
    function visit(a, b) {
        if (typeof a !== 'object' || typeof b !== 'object' || a === null || b === null)
            return a === b;
        if (Object.getPrototypeOf(a) !== Object.getPrototypeOf(b))
            return false;
        if (left.has(a))
            return left.get(a) === b
        if (right.has(b))
            return right.get(b) === a
        for (let k in a)
            if (has.call(a, k) && !has.call(b, k))
                return false;
        for (let k in b)
            if (has.call(b, k) && !has.call(a, k))
                return false;
        left.set(a, b);
        right.set(b, a);
        for (let k in a)
            if (has.call(a, k) && !visit(a[k], b[k]))
                return false;
        return true;
    }
    return visit(a, b);
}
Anders Kaseorg
  • 3,657
  • 22
  • 35
  • Second algorithm really doesn't look linear. – djechlin Sep 26 '15 at 06:40
  • @djechlin: It is. For each node in the graph, we do an amount of work proportional to the number of edges from that node. We process each node at most once, so the total amount of work is linear in the size of the graph (nodes plus edges). – Anders Kaseorg Sep 26 '15 at 06:46
  • 1
    WTH is `if (a.prototype !== b.prototype)` supposed to do? – Bergi Sep 26 '15 at 13:51
  • What about `NaN`? Shouldn't it be considered equal as well? It's probably better to build on top of an existing implementation than rolling your own. – Felix Kling Sep 26 '15 at 21:47
  • 1
    @Bergi: I’ve corrected this to `if (Object.getPrototypeOf(a) !== Object.getPrototypeOf(b))`. @FelixKling: I considered this, and I did look at existing implementations, such as node.js `assert.deepEqual()`, which also treats `NaN` as unequal to itself. If you want `NaN` to be equal to itself, you can replace the `a === b` test with `Object.is(a, b)`. – Anders Kaseorg Sep 27 '15 at 01:57
0

how to improve Anders Kaseorg's answer:

If you use the algorithm on extremely large data structures, you can get a stack overflow error. That happens for example with a complete graph with 5,000 nodes. So I wrote a non-recursive version, which uses breadth first search instead of depth first search, since that seemed way easier to implement (when not using recursion). The iterative version works fine for the complete graph with 5,000 nodes (takes a whopping 6 seconds, though, on my machine). Here it is:

function deepEqual(item1, item2){
    var EQUAL_ATOM = 1, UNEQUAL = 2, OBJECT = 3;
    function compareSimple(first, second){
        var ty1 = typeof first, ty2 = typeof second;
        if (ty1!==ty2) return UNEQUAL;
        if (ty1!=='object'){
            if (first===second) return EQUAL_ATOM;
            if ((ty1==='number')&&isNaN(first)&&isNaN(second)) return EQUAL_ATOM;
            return UNEQUAL;
        }
        if (first===null) return (second===null) ? EQUAL_ATOM : UNEQUAL;
        if (second===null) return UNEQUAL;
        if (Object.getPrototypeOf(first) !== Object.getPrototypeOf(second)) return UNEQUAL;
        return OBJECT;
    }
    var c = compareSimple(item1, item2);
    if (c !== OBJECT) { return (c===EQUAL_ATOM); }
    var stack1 = [], stack2 = [], inverse1 = new Map(), inverse2 = new Map();
    stack1.push(item1); stack2.push(item2);
    inverse1.set(item1, 0); inverse2.set(item2, 0);
    var currentIdx = 0;
    var firstItem, secondItem, i, own, has1, has2, key, kid1, kid2, itemCount;
    while (currentIdx < stack1.length){
        firstItem = stack1[currentIdx]; secondItem = stack2[currentIdx];
        own = {};
        for (key in firstItem){
            has1 = firstItem.hasOwnProperty(key);
            has2 = secondItem.hasOwnProperty(key);
            if (has1 !== has2) return false;
            if (has1) { own[key] = null; }
        }
        for (key in secondItem){
            if (!(key in own)){
                has1 = firstItem.hasOwnProperty(key);
                has2 = secondItem.hasOwnProperty(key);
                if (has1 !== has2) return false;
                if (has1) { own[key] = null; }
            }
        }
        for (key in own){
            kid1 = firstItem[key];
            kid2 = secondItem[key];
            c = compareSimple(kid1, kid2);
            if (c === UNEQUAL) return false;
            if (c === OBJECT){
                has1 = inverse1.has(kid1);
                has2 = inverse2.has(kid2);
                if (has1 !== has2) return false;
                if (has1){
                    if (inverse1.get(kid1) !== inverse2.get(kid2)) { return false; }
                } else {
                    itemCount = stack1.length;
                    stack1.push(kid1); stack2.push(kid2);
                    inverse1.set(kid1, itemCount); inverse2.set(kid2, itemCount);
                }
            }
        }
        ++currentIdx;
    }
    return true;
}

I added some speed tests on the jsperf.com website. Interestingly, depending on the data structure, sometimes Anders's recursive version is faster, and sometimes my iterative version is faster, with the average being more in Anders's favor.

Here are the links to the tests on jsperf:

nephews example

cycle free real world JSON from reddit example

uncles example

complete graph with 2K nodes

complete graph with 5K nodes

Moreover, builtin objects aren't handled in the way you'll probably want. Many or most builtin objects "hide" their keys. If you do Object.keys(...), you'll just get an empty array.

now = new Date();
keys = Object.keys(now);  // result: []

Hence, for example, any 2 Dates are deepGraphEqual to each other, also any 2 RegExps. That's most probably not what you want. I don't have a "catch all" for all those, and going through all existing "builtin" objects would take really long. But as for Dates and RegExps, here is how you can put in something more reasonable, using .toString() to compare them instead.

mathheadinclouds
  • 3,507
  • 2
  • 27
  • 37