4

I'm doing some sanity testing on a personal project before updating some state to use ImmutableJS. I have a small test I wrote to make sure that Immutable.List.equals performs like I expect it should -- O(1).

https://github.com/naddeoa/immutablejs-slow-equals.git

The important part is below

function compareImmutableMany(l1, l2) {
  let result = false;
  for (let i = 0; i < iterations; i++) {
    result = l1.equals(l2);
  }
  return result;
}

i1 = createImmutableList();
i2 = createImmutableList();
console.log("immutable lists are equal", compareImmutableMany(i1, i2));

The test is comparing two native js lists of size 100,000 and then two Immutable.List of size 100,000, each 1000 times. I must be missing something. I'm seeing the Immutable.List sample perform very bad relative to the native list sample.

starting
immutable lists are equal true
Seconds: 11.423
starting
normal lists are equal: true
Seconds: 0.109

You can run this locally with the following.

git clone https://github.com/naddeoa/immutablejs-slow-equals.git
cd immutablejs-slow-equals
npm i
node main.js

If I'm making a simple mistake then I'd appreciate some eyes to let me know where it is. I definitely expected this to be very fast. Especially because replacing l1.equals(l2) with a call to l1.hashCode() is really fast.

Anthony Naddeo
  • 2,497
  • 25
  • 28
  • I'm curious why you expected it to be instant? While the Lists are Immutable, you're still creating two instances of Immutable lists. I don't believe the implementation underneath will reuse a list if it determines the values to be the same, which would be the only case I can see it being instant. It has to compare every value with every other value. – Andrew T Finnell Dec 03 '18 at 22:24
  • The time isn't spent creating lists, its spent doing the equality checking. I assumed that equality was defined has having equal `.hashCode()` output, but another commenter just implied it isn't the case. If it was the case then it would be pretty much instant. That doesn't answer all of my questions either unfortunately. In addition, it wouldn't say much about why its so much slower than a native list doing the same thing. – Anthony Naddeo Dec 03 '18 at 23:14
  • You are correct, equality checks should NEVER rely on hashCode(). .hashCode() is NOT a unique number but rather a best guess so that it can fit into a 'bucket' for sorting. Never use hashCode() for equality. With proper .hashCode() one _should_ be able to use it for non-equality, just not equality. – Andrew T Finnell Dec 04 '18 at 18:29

2 Answers2

2

According to the docs:

Two immutable collections are considered value equal (via .equals() or is()) if they represent the same collection of values. This differs from JavaScript's typical reference equal (via === or ==) for Objects and Arrays which only determines if two variables represent references to the same object instance.

This means that it can't be O(1) because it needs to check the equality of all the values in the list. This is going to be substantially slower (as you have found).

The performance tradeoff is also documented:

When comparing two collections, value equality may require considering every item in each collection, on an O(N) time complexity. For large collections of values, this could become a costly operation. Though if the two are not equal and hardly similar, the inequality is determined very quickly. In contrast, when comparing two collections with reference equality, only the initial references to memory need to be compared which is not based on the size of the collections, which has an O(1) time complexity. Checking reference equality is always very fast, however just because two collections are not reference-equal does not rule out the possibility that they may be value-equal.

Mark
  • 90,562
  • 7
  • 108
  • 148
  • 1
    Something is still missing here. Both l1 and l2 have the same `hashCode()` values, despite being different lists. I expected that to be the extent of the equality check. Isn't this the value proposition? What is the expectation for comparing large collections for value equality? – Anthony Naddeo Dec 03 '18 at 22:27
  • I agree it can be confusing @AnthonyNaddeo — how it hashes non-immutable objects might be the source of the issue. That's discussed [here](https://facebook.github.io/immutable-js/docs/#/hash) – Mark Dec 03 '18 at 22:32
  • Hmm, I'm not sure I follow. There are only immutable objects in this scenario and they already have the same hashcode as each other. If that isn't sufficient then what is the expected way to get O(1) equality checks with ImmutableJS? – Anthony Naddeo Dec 03 '18 at 22:37
  • Sorry, @AnthonyNaddeo I thought your IM object had other nested objects. But in answer to your question, no it is not sufficient because of the possibility of hash collisions. As the docs say: "If two values have the same hashCode, they are not guaranteed to be equal." I'm not sure you *can* get O(1) equality checks. – Mark Dec 03 '18 at 22:47
  • Hmm, but that raises another huge question. Isn't that half of the value prop here? I can appreciate immutability for its own sake, but being able to tell if there is a change between two immutable objects was the entire pitch for using this with React. – Anthony Naddeo Dec 03 '18 at 22:51
  • @AnthonyNaddeo Perhaps it's more important to know when two objects *aren't* equal so you can tell if something has changed. While the same hash doesn't guarantee equality, different hashes guarantee inequality. There's a good [discussion here](https://stackoverflow.com/questions/34385243/why-is-immutability-so-important-or-needed-in-javascript) about the value proposition. – Mark Dec 03 '18 at 22:56
  • 1
    Does that mean that the expected implementation of a `shouldComponentUpdate()` method is actually `nextProps.foo.hashCode() !== this.props.foo.hashCode()`, and that `.equals` is effectively never used for this purpose? – Anthony Naddeo Dec 03 '18 at 23:03
  • It looks like the authors intended that you use reference equality combined with the update APIs that return the same object when an update results in no change. Which means you get different equality checks each time you replace an immutable object with another one I guess. Perhaps just a JS trade off... better than nothing. That said, its as vulnerable to hash issues as just using hash directly. – Anthony Naddeo Dec 03 '18 at 23:10
  • .hashCode() should not be used for `==` checks. Two different values can produce the exact same `.hashCode()` in theory. The same values must always produce the same hashCode(), but it does not mean different values cannot also produce that hashCode(). It can only be used for `!==` checking. For example, hypothetically: `'A'.hashCode() == '10'`, `'B'.hashCode() == '10'`, `'C'.hashCode() == '11'`. Given the variables `var obj1 = 'A', var obj2 = 'B', obj3 = 'C'`. You cannot `obj1.hashCode() == obj2.hashCode()` for what I believe is obvious reasons. – Andrew T Finnell Dec 04 '18 at 18:32
0

I think this could be classified as user error on my part. I was used to immutability in languages where it's a first class citizen and it didn't map will onto JS with ImmutableJS. My mistakes were (at least) as follows.

First, .equals() is a deep equal by design and it isn't what you should be relying on to implement performant equality checks in something like a shouldComponentUpdate call in React.

Second, to get the most out of ImmutableJS, you need to use its API correctly. Here is an example.

const Foo = Immutable.Record({a:1})

const foo1 = Foo({a:2})
const foo2 = Foo({a:2})

const cache = Immutable.Map({'key': foo1})

// Doing it this way will force you to either use .equals() or .hashCode() 
// to check for changes later
const wrongWay = cache.set('key', foo2)
console.log('This is false', wrongWay === cache) 
console.log('This is true', wrongWay.hashCode() === cache.hashCode()) 
console.log('This is true and slow', wrongWay.equals(cache))  

// Doing it this way will let you rely in reference equality
const rightWay = cache.mergeDeep({'key': foo2})
console.log('This is true', rightWay === cache) 
console.log('This is true, but not needed', rightWay.hashCode() === cache.hashCode()) 
console.log('This is true and slow, but not needed', rightWay.equals(cache))  

Third, when using this with React with the intention of using PureComponent which does a shallow, reference equality check, you need to make sure that you're going to end up with reference equality (like in the example above).

Finally, sometimes you need to execute expensive functions on your app state in React just to be able to get the props for your component. In order to avoid doing that for no reason you can use something like Lodash's memoize in combination with reference equality on immutable objects like in the following contrived example.

const myFunction = _.memoize((immutableAppState) => {
    return immutableAppState.immutableList.map(/* really expensive stuff returning new immutable objects based on other selectors */)
}, (immutableAppState) => immutableAppState.specificRecordThatChanges )

If someone spots some errors here (or missing errors) then please point them out and I'll update the explanation.

Anthony Naddeo
  • 2,497
  • 25
  • 28
  • Yes, there is an error. You cannot use `wrongWay.hashCode() === cache.hashCode()` to check for changes as hashCode() is NOT guranteed to be unique across different values. It is **perfectly valid** for an implementation of `.hashCode()` to do this: `function hashCode() { return 1; }` – Andrew T Finnell Dec 04 '18 at 18:37
  • I'm not saying that you should do that, I'm just saying that in these examples it will be the same because the values of the two immutable objects are identical. I suppose I can make that more explicit. – Anthony Naddeo Dec 04 '18 at 20:21
  • That said, I should change the memo example to _not_ use hashCode() as the key to the memo, since those can collide too and lodash doesn't seem to offer the hooks to use a .equals() to disambiguate between collisions. – Anthony Naddeo Dec 04 '18 at 20:28