
Out of curiosity, I decided to confirm my belief that the Set implementation is faster than Object (running in Google Chrome) in terms of insertion/addition and lookup. My results were slightly confusing.

x = {};
y = new Set();

console.time('Insert into Object');
for (var i = 0; i < 1000000; i++) {
    x[i] = 0;
}
console.timeEnd('Insert into Object');

console.time('Insert into Set');
for (i = 0; i < 1000000; i++) {
    y.add(i);
}
console.timeEnd('Insert into Set');

var t = 0;

console.time('Retrieve from Object');
for (i = 0; i < 1000000; i++) {
    t = x[i];
}
console.timeEnd('Retrieve from Object');

console.time('Retrieve from Set');
for (i = 0; i < 1000000; i++) {
    t = y.has(i);
}
console.timeEnd('Retrieve from Set');

Insert into Object: 1341.777ms
Insert into Set: 1473.025ms
Retrieve from Object: 1469.717ms
Retrieve from Set: 1666.430ms

As you can see, the Set performed slightly worse than the Object. This confuses me because I thought the underlying implementation would just be the same as the Object's, without the extra overhead of storing the values. Does anyone have any insight as to why this is?

Chad Pendergast
  • Parts of your benchmark can be eliminated (like `t = x[i]`, which has no side effects and `t` is not used after), which may be skewing your results. You should put together a full benchmark and prevent optimizations, then test in a proper harness to get accurate samples and results. – ssube May 26 '16 at 15:12
  • When I run your code in Chrome, objects are almost 4 times faster than sets. Both are slower than arrays (Array retrieval is almost 5 times faster than objects, which in turn is almost 8 times faster than sets). – Ian McLaird May 26 '16 at 17:21
  • suspicion: it's because you're allowing the set to grow. – Ian McLaird May 26 '16 at 17:26
  • I'm not sure what you mean by "*without the extra overhead of storing the values*"? – Bergi May 26 '16 at 17:51

2 Answers


To counter your question: why do you believe that simply storing a value (the Object case) is likely to be slower than checking the existing values to see whether you need to store it (the Set case)?

Just to be complete, I modified your code to include Arrays, too.

x = {};
y = new Set();
z = [];

console.time('Insert into Object');
for (var i = 0; i < 1000000; i++) {
    x[i] = 0;
}
console.timeEnd('Insert into Object');

console.time('Insert into Set');
for (i = 0; i < 1000000; i++) {
    y.add(i);
}
console.timeEnd('Insert into Set');

console.time('Insert into Array');
for (i = 0; i < 1000000; i++) {
    z[i] = 0;
}
console.timeEnd('Insert into Array');

var t = 0;

console.time('Retrieve from Object');
for (i = 0; i < 1000000; i++) {
    t = x[i];
}
console.timeEnd('Retrieve from Object');

console.time('Retrieve from Set');
for (i = 0; i < 1000000; i++) {
    t = y.has(i);
}
console.timeEnd('Retrieve from Set');

console.time('Retrieve from Array');
for (i = 0; i < 1000000; i++) {
    t = z[i];
}
console.timeEnd('Retrieve from Array');

console.log(t);

Ordinarily, one would expect that storing into an object is O(1) time complexity, while checking the existing values in a set should be no more than O(n) (where n is the number of items in the set), so we should expect the set to become slower as it grows. The performance may also be implementation-dependent, i.e. different JavaScript runtimes may behave differently.
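That O(n) estimate corresponds to a set naively implemented as a linear scan over an array, roughly like the purely illustrative sketch below (as the comments below point out, real engines are required to do better than linear):

// Illustrative only: a "set" backed by a plain array, where every add and
// lookup is a linear scan. This is the mental model behind the O(n) estimate.
function NaiveSet() {
    this.items = [];
}

NaiveSet.prototype.add = function (value) {
    if (this.items.indexOf(value) === -1) { // O(n) scan before each insert
        this.items.push(value);
    }
    return this;
};

NaiveSet.prototype.has = function (value) {
    return this.items.indexOf(value) !== -1; // O(n) scan per lookup
};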

And in fact, that's exactly what we see: (Running on my machine)

(your code)

Insert into Object: 67.558ms
Insert into Set: 259.841ms
Insert into Array: 64.297ms
Retrieve from Object: 19.337ms
Retrieve from Set: 149.968ms
Retrieve from Array: 3.981ms

But if we change your insertion tests to continually re-add the same values...
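(The modified code isn't shown here; a minimal sketch of that change, assuming every iteration re-inserts the same key into all three collections while the retrieval loops stay exactly as they were, would be:)

var x = {};
var y = new Set();
var z = [];

// Same harness as above, but every iteration writes the same key/value,
// so none of the collections actually grow past a single entry.
console.time('Insert into Object');
for (var i = 0; i < 1000000; i++) {
    x[0] = 0;
}
console.timeEnd('Insert into Object');

console.time('Insert into Set');
for (i = 0; i < 1000000; i++) {
    y.add(0);
}
console.timeEnd('Insert into Set');

console.time('Insert into Array');
for (i = 0; i < 1000000; i++) {
    z[0] = 0;
}
console.timeEnd('Insert into Array');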

We see this:

Insert into Object: 19.103ms
Insert into Set: 40.645ms
Insert into Array: 16.384ms
Retrieve from Object: 40.116ms
Retrieve from Set: 30.672ms
Retrieve from Array: 70.050ms

This tells us a couple of things. First, the different collection types are differently sensitive to the size of the collection. Second, retrieving non-existent keys from either an array or an object is slower than checking a set for the same value (at least on this particular implementation).
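If you want to see that size sensitivity directly rather than infer it from two data points, one way (the helper below is hypothetical, not part of the original benchmark) is to time the same number of lookups against collections of several different sizes; the `sink` accumulator also keeps the loop bodies from being trivially optimized away, as ssube's comment on the question suggests:

// Hypothetical helper: builds an Object and a Set with `size` entries, then
// times `lookups` membership checks against each, so you can watch how the
// timings change as `size` grows.
function timeLookups(size, lookups) {
    var obj = {};
    var set = new Set();
    for (var i = 0; i < size; i++) {
        obj[i] = 0;
        set.add(i);
    }

    var sink = 0; // accumulate results so the loops have an observable effect

    console.time('Object lookups, size ' + size);
    for (i = 0; i < lookups; i++) {
        if (obj[i % size] !== undefined) sink++;
    }
    console.timeEnd('Object lookups, size ' + size);

    console.time('Set lookups, size ' + size);
    for (i = 0; i < lookups; i++) {
        if (set.has(i % size)) sink++;
    }
    console.timeEnd('Set lookups, size ' + size);

    return sink;
}

[1000, 100000, 1000000].forEach(function (size) {
    timeLookups(size, 1000000);
});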

As a final note

I think it's important to note here that we're discussing the relative performance of things that take well under a second for a million iterations (or about a second-and-a-half on your test). That's pretty fast already. For those reading this in the future, remember to choose the data structure that's semantically the one you want. Any of these three is a perfectly good choice from a performance perspective. Choose the one that makes your program the most understandable.

Ian McLaird
    "*checking the existing values in a set should be O(n)*" - no, [they must be better than that](http://stackoverflow.com/a/31092145/1048572). V8 [even has `O(1)`](http://stackoverflow.com/q/33611509/1048572). – Bergi May 26 '16 at 17:54
  • Did you also change the object and array insertions to only contain the `0` item in that last benchmark? – Bergi May 26 '16 at 17:55
  • @Bergi, I didn't need to, because they were already that way in the OP's code. Only the `Set` was being given different values. – Ian McLaird May 26 '16 at 18:02
  • But you are correct that the `Set` must be sub-linear complexity. I'll update the answer to reflect that. On the other hand, that still leaves the mystery of why adding the same value over and over again is so much faster than adding different values. It would seem that it may be sub-linear, but still more than O(1). – Ian McLaird May 26 '16 at 18:05
  • No, they're not - he just uses `0` as a flag value. He could've used `1` or `true` or anything else as well. You need to change the code to `x[0] = …` and `z[0] = …` to "insert" the value `0`. – Bergi May 26 '16 at 18:08
  • Oh, I'm sorry. I misunderstood. You're right. The object and the array are also growing here. I'll re-benchmark that. – Ian McLaird May 26 '16 at 18:09
  • "effectively `O(1)`" doesn't mean that the time must always be the same or must not grow with the set size, just that it needs to stay below a certain constant. – Bergi May 26 '16 at 18:10

The test case is a bit flawed.

  1. Because of the consistent usage of integer keys, the object will probably be optimized to an array.
  2. The lookup in the object would more properly use `x.hasOwnProperty(key)` or `x[key] !== undefined`; see the sketch below.
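For example (a sketch only, not from the original post), a retrieval loop that actually mirrors `y.has(i)` would check for the key's presence rather than read its value:

var x = {};
for (var i = 0; i < 1000000; i++) {
    x[i] = 0;
}

var t = false;

// Membership-style check on the object, comparable to y.has(i) in the question.
console.time('Check key in Object');
for (i = 0; i < 1000000; i++) {
    t = x.hasOwnProperty(i);
}
console.timeEnd('Check key in Object');

console.log(t); // use the result so the loop can't be trivially eliminated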
Useless Code
Elias Hasle