3
var ones = [1];
ones[1] = ones;
ones;
// => [1, [1, [1, [1, [1, ...]]]]]

var ones_ = [1];
ones_[1] = ones_;
ones_;
// => [1, [1, [1, [1, [1, ...]]]]]

How does one determine that ones and ones_ are equal? Is there an algorithm which handles circular structures such as the above?

davidchambers
  • 23,918
  • 16
  • 76
  • 105
  • 1
    Just to clarify, do you also wish to consider the object `var foo = [1]; foo[1] = [1, foo];` to be equal to the objects above? – Ilmari Karonen Nov 16 '16 at 00:04
  • Correct. The identifier which happens to be associated with a value is not relevant. – davidchambers Nov 16 '16 at 00:13
  • Specifically I'm interested in updating [`Z.equals`](https://github.com/sanctuary-js/sanctuary-type-classes#equals) ([source](https://github.com/sanctuary-js/sanctuary-type-classes/blob/v1.1.0/index.js#L912-L957)) to treat equivalent circular structures as equal. The current algorithm detects circular references to avoid recursing indefinitely, but it simply returns `false` as soon as a circular reference is detected. I'm hoping that someone can point me to a paper which describes a general algorithm which is applicable in a JavaScript context. – davidchambers Nov 16 '16 at 01:29
  • 1
    I misread your comment, @IlmariKaronen. I now understand the question you were asking, which is whether the depth at which the recursion begins is significant. I do not consider this significant. Two objects which represent the value `[1, [1, [1, [1, [1, ...]]]]]` should be considered equal even if their structures are not identical. – davidchambers Nov 16 '16 at 01:53
  • I'd say both the recursive type and the type of the wrapped values have to be setoids and additionally, the recursive type have to be foldable. What exactly do you mean with "circular"? You just show a linked list in your example. –  Nov 16 '16 at 19:27
  • Is a linked list with a recursive tail pointer not circular, @ftor? – davidchambers Nov 16 '16 at 21:55
  • Oh, I misinterpreted circular data structures, sorry. I personally prefer the term infinite. So your infinite list must be lazily foldable. And still it needs an `eq` method that takes the `eq` method of the inner values as an additional argument. Hence both the infinite list and the inner values have to be setoids. –  Nov 17 '16 at 07:40
  • We do the OO equivalent of what you describe, @ftor. Rather than take an equality function as an additional argument, each `fantasy-land/equals` method dispatches to the `fantasy-land/equals` methods of the inner values. – davidchambers Nov 18 '16 at 14:24

2 Answers2

5

A basic way to solve this problem is to note that if, during a recursive comparison, we ever end up again comparing the same pair of objects that we're already comparing, then we may simply assume that they're equal. This works because, if they're not equal after all, then the comparison that is already in progress will eventually find some difference between them.

Thus, we may simply start with a basic recursive comparison function, and add a stack of objects being currently compared:

function isEqual (a, b) {
    var stack = [];
    function _isEqual (a, b) {
        // console.log("->", stack.length);
        // handle some simple cases first
        if (a === b) return true;
        if (typeof(a) !== "object" || typeof(b) !== "object") return false;
        // XXX: typeof(null) === "object", but Object.getPrototypeOf(null) throws!
        if (a === null || b === null) return false;
        var proto = Object.getPrototypeOf(a);
        if (proto !== Object.getPrototypeOf(b)) return false;
        // assume that non-identical objects of unrecognized type are not equal
        // XXX: could add code here to properly compare e.g. Date objects
        if (proto !== Object.prototype && proto !== Array.prototype) return false;

        // check the stack before doing a recursive comparison
        for (var i = 0; i < stack.length; i++) {
            if (a === stack[i][0] && b === stack[i][1]) return true;
            // if (b === stack[i][0] && a === stack[i][1]) return true;
        }

        // do the objects even have the same keys?
        for (var prop in a) if (!(prop in b)) return false;
        for (var prop in b) if (!(prop in a)) return false;

        // nothing to do but recurse!
        stack.push([a, b]);
        for (var prop in a) {
            if (!(_isEqual(a[prop], b[prop]))) {
                stack.pop();
                return false;
            }
        }
        stack.pop();
        return true;
    }
    return _isEqual(a, b);
}

// TEST CASES:

var ones = [1]; ones[1] = ones;
var foo = [1]; foo[1] = [1, foo];
var bar = [1]; bar[1] = [1, ones];
console.log("ones == foo:", isEqual(ones, foo));
console.log("ones == bar:", isEqual(ones, bar));
console.log("foo == bar:", isEqual(foo, bar));

var obj = {}; obj["x"] = obj; obj["y"] = {obj};
console.log("obj == obj[x]:", isEqual(obj, obj["x"]));
console.log("obj != obj[y]:", !isEqual(obj, obj["y"]));

var seven = []; seven[0] = [[[[[[seven]]]]]];
var eleven = []; eleven[0] = [[[[[[[[[[eleven]]]]]]]]]];
console.log("seven == eleven:", isEqual(seven, eleven));
console.log("[seven] == [eleven]:", isEqual([seven], [eleven]));
console.log("[seven] == seven:", isEqual([seven], seven));
console.log("[seven] == [[[eleven]]]:", isEqual([seven], [[[eleven]]]));

Note that a lot of the complexity in the code above is due to it trying to accept and (more or less) gracefully handle any mixture of different kinds of JavaScript values, including primitives, null values, arrays, plain objects and all the other miscellaneous things a JS variable can contain. If you know your inputs can only contain a limited range of data types, then it may be possible to simplify this code considerably.

Ps. Due to the stack comparison, the runtime of this code can be up to O(nd), where n is the number of nodes in the trees that need to be compared (which can be more than expected; for example, comparing the objects seven and eleven in the snippet above takes 77 recursive calls) and d is the depth of the stack (which, in this case, also gets up to 77). In ES2015, a potentially useful optimization could be to use a Map of previously seen objects to reduce the stack lookup from O(d) to effectively O(1). If we expected the data structures being compared to generally have a lot of repetitive branches containing references to the same objects, it might even be useful to expand this into a general cache of previously seen objects that we've already found to be equal.

Ilmari Karonen
  • 49,047
  • 9
  • 93
  • 153
2

You can 'decycle' an object by iterating it and replacing already seen objects with "pointers" (=indexes in some array). Once you get decycled structs, simply serialize them and compare as strings:

let encode = function (x) {

    function _enc(x, lst) {
        if (typeof x !== 'object')
            return x;

        let i = lst.indexOf(x);
        if (i >= 0)
            return {'->': i};
        lst.push(x);

        let y = {};
        for (let k of Object.keys(x))
            y[k] = _enc(x[k], lst)

        return y;
    }

    return JSON.stringify(_enc(x, []));
};

//////

let ones = [1];
ones[1] = ones;

let ones_ = [1];
ones_[1] = ones_;

console.log(encode(ones) === encode(ones_))

// more interesting example

a = {
    b: {
        c: 123
    }
};

a.b.d = a;
a.x = [9, a.b];


a2 = {
    b: {
        c: 123
    }
};

a2.b.d = a2;
a2.x = [9, a2.b];

console.log(encode(a) === encode(a2))
georg
  • 211,518
  • 52
  • 313
  • 390
  • 1
    A clever idea. Alas, this code will not correctly identify my `var foo = [1]; foo[1] = [1, foo]` object as being identical to the OP's `var ones = [1]; ones[1] = ones`. (Also, another lesser problem is that it might give a false positive if the data structure contained an object with the single key `"->"` and an integer value.) – Ilmari Karonen Nov 16 '16 at 01:01
  • The fact that `encode([1, {'->': 0}])` and `encode(ones)` are equivalent despite `[1, {'->': 0}]` not being equivalent to `ones` is a problem. I'm looking for a general algorithm applicable to any pair of arrays. – davidchambers Nov 16 '16 at 01:40
  • @IlmariKaronen: an interesting question is how to formally define "equality" of such structures, e.g. `1->2->3->1...` vs `1->2->3->1->2->3->1...`. Like "two cycled graphs are considered 'equal' if..." what? – georg Nov 16 '16 at 11:50
  • 2
    @georg: I think a natural definition of equivalence for such graphs is [bisimulation](https://en.wikipedia.org/wiki/Bisimulation), which I believe the algorithm in my answer effectively computes. Basically, applied to nested data structures, it effectively says that two data structures are equivalent if any valid chain of indexes like `struct1[key1][key2]...[keyN]` for one of the structures is also valid for the other, and yields an equivalent value. – Ilmari Karonen Nov 16 '16 at 20:36