27

According to the fundamentals of CS, searching an unsorted list has to occur in O(n) time, whereas a direct key lookup in a hash map occurs in O(1) time on average.

So is it more performant to map an array into a dictionary and then access the elements directly, or should I just use includes? This question is specifically about JavaScript, because I believe the answer comes down to the core implementation details of how includes() and {} are implemented.

let y = [1, 2, 3, 4, 5];
y.includes(3); // true

or...

let y = {
  1: true,
  2: true,
  3: true,
  4: true,
  5: true
};
5 in y; // true
Luke Xu

3 Answers

31

It's true that object lookup occurs in constant time - O(1) - so using object properties instead of an array is one option. But if you're just trying to check whether a value is included in a collection, it's more appropriate to use a Set, which is a (generally unordered) collection of values that can also be looked up in (average) constant time. Using a plain object instead would require you to store values in addition to your keys, which you don't care about - so use a Set instead.

const set = new Set(['foo', 'bar']);
console.log(set.has('foo')); // true
console.log(set.has('baz')); // false

This will be useful when you have to look up multiple values against the same Set. But building the Set in the first place (just like populating an object with properties) is O(N) in the number of items, so if you're only going to look up a single value, once, there's no benefit to this nor to the object technique, and you may as well just use an array includes test.
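As a sketch of that trade-off (the `allowed` values and function name here are made up for illustration): pay the O(N) build cost once, then each check is cheap.

```javascript
// Build the Set once up front (O(N) in the number of items)...
const allowed = new Set(['foo', 'bar', 'qux']);

// ...then each membership check is O(1) on average.
function isAllowed(value) {
  return allowed.has(value);
}

console.log(isAllowed('bar')); // true
console.log(isAllowed('baz')); // false
```

If `isAllowed` is only ever called once, the build cost dominates and a plain `['foo', 'bar', 'qux'].includes(value)` is just as good.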

Alisson Reinaldo Silva
CertainPerformance
  • Would you know how a set is implemented? I didn't know Sets had constant time lookup. My guess from that assumption is that it's implemented as a hashmap behind the scenes but without the satellite data. – Luke Xu Nov 22 '18 at 04:35
  • See https://www.ecma-international.org/ecma-262/6.0/#sec-set-objects - "Set objects must be implemented using either hash tables or other mechanisms that, on average, provide access times that are sublinear on the number of elements in the collection." It's generally [`O(1)`](https://stackoverflow.com/questions/33611509/es6-map-and-set-complexity-v8-implementation) – CertainPerformance Nov 22 '18 at 04:38
  • @CertainPerformance Just want to ask: isn't adding an element to a Set O(1) time? – amrender singh Nov 22 '18 at 05:09
  • @amrendersingh Yes, it is - *adding* a property is a predictable, simple operation (significantly less complex than looking through a data structure) – CertainPerformance Nov 22 '18 at 05:15
  • @CertainPerformance "But, adding items to the Set (just like adding properties to an object) is O(N)" here you are referring to adding multiple elements to set iteratively? – amrender singh Nov 22 '18 at 05:17
  • @amrendersingh Exactly, eg `const set = new Set(arr);` is very likely `O(N)` where `N` is the `length` of `arr`. – CertainPerformance Nov 22 '18 at 05:18
13

Updated 04/29/2020

As the commenter rightly pointed out, it would seem V8 was optimizing away the array includes calls. An updated version that assigns the result to a variable and uses it produces more expected results. In that case Object address is fastest, followed by Set has, with Array includes a distant third (on my system/browser).

All the same, I stand by my original point: if you're making micro-optimizations, it's worth testing your assumptions. Just make sure your tests are valid ;)

Original

Well, despite the obvious expectation that Object address and Set has should outperform Array includes, benchmarks in Chrome indicate that implementation trumps expectation.

In the benches I ran in Chrome, Array includes was far and away the best performer.

I also tested locally with Node and got more expected results: Object address wins, followed closely by Set has, with Array includes marginally slower than both.

Bottom line: if you're making micro-optimizations (not that I recommend it), it's worth benchmarking rather than assuming which approach is best for your particular case. Ultimately it comes down to the implementation, as your question implies, so optimizing for the target platform is key.

Here's the results I got:

Node (12.6.0):

ops for Object address 7804199
ops for Array includes 5200197
ops for Set has        7178483

Chrome (75.0):
https://jsbench.me/myjyq4ixs1/1

[screenshot: benchmark results in Chrome]
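If you want to reproduce this kind of comparison yourself, here's a rough sketch (the collection size and iteration count are arbitrary choices, and the lookup result is used so the engine can't optimize the check away, per the comment below):

```javascript
// Rough benchmark sketch; absolute numbers will vary by engine and version.
const N = 10000;
const arr = Array.from({ length: N }, (_, i) => i);
const set = new Set(arr);
const obj = Object.fromEntries(arr.map(i => [i, true]));

function time(label, check) {
  const t = performance.now();
  let hits = 0;
  for (let i = 0; i < 100000; i++) {
    // Use the result so the JIT can't dead-code-eliminate the lookup.
    if (check(i % N)) hits++;
  }
  console.log(label, (performance.now() - t).toFixed(1), 'ms');
  return hits;
}

time('Array includes ', v => arr.includes(v));
time('Set has        ', v => set.has(v));
time('Object address ', v => v in obj);
```

`performance.now()` is available globally in browsers and in recent Node versions.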

andrew.carpenter
  • Your testcases are with very small sets of items, and you don't use the result of the check, so I bet the JS JIT just optimizes the accesses away in the array.includes case. If you use larger sample sizes and use the results of the check, the Set has operation is more than 40-100 times faster than Array.includes, for sets of strings (tested on Chrome and Firefox on macOS). https://jsbench.me/t4k9lk7h7j/1 - even this version has issues because we're not testing how quick negative lookups (i.e. items not in the list) are. – Gijs Apr 29 '20 at 16:38
  • The updated version, with **3k** elements in the array, **array.includes** vs **set.has** vs **obj.prop**. Fastest to slowest: obj.prop, set.has, array.includes https://jsbench.me/kwkyix6rpf/1 – nemanjabu Jan 17 '22 at 16:48
0

This isn't necessarily a direct answer to the question, but here is a related performance test I ran quickly in my Chrome dev tools.

function getRandomInt(max) {
    return Math.floor(Math.random() * max);
}
var arr = [1, 2, 3];
var t = performance.now();
for (var i = 0; i < 100000; i++) {
    var x = arr.includes(getRandomInt(3));
}
console.log(performance.now() - t); // 9.100000001490116

var t = performance.now();
for (var i = 0; i < 100000; i++) {
    var n = getRandomInt(3);
    var x = n == 1 || n == 2 || n == 3;
}
console.log(performance.now() - t); // 5.699999995529652

I find the array includes syntax nice to look at, so I wanted to know whether its performance would likely be an issue for the way I use it - for instance, checking whether a variable is one of a set of enums. For a short list like this, it doesn't seem to have much impact. Then I ran this.

function getRandomInt(max) {
    return Math.floor(Math.random() * max);
}
var t = performance.now();
for (var i = 0; i < 100000; i++) {
    var x = [1, 2, 3].includes(getRandomInt(3));
}
console.log(performance.now() - t); // 12.600000001490116

var t = performance.now();
for (var i = 0; i < 100000; i++) {
    var n = getRandomInt(3);
    var x = n == 1 || n == 2 || n == 3;
}
console.log(performance.now() - t); // 4.399999998509884

And so the way I actually use it (and like looking at it) is quite a bit worse for performance. It's still not very significant unless it runs a few million times, but using it inside an Array.filter that may run a lot - say, as a React Redux selector, which is what I was about to do when I decided to test this - may not be a great idea.
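One way to keep the readable syntax without rebuilding the list on every call is to hoist it out of the hot path as a Set. A sketch with made-up names (`ACTIVE_STATUSES`, `isActiveStatus` are hypothetical, not from any library):

```javascript
// Hypothetical example: the collection is built once at module level,
// not recreated inside the filter callback on every invocation.
const ACTIVE_STATUSES = new Set(['active', 'pending', 'trial']);

function isActiveStatus(status) {
  return ACTIVE_STATUSES.has(status);
}

// e.g. inside a selector that filters frequently:
const users = [{ status: 'active' }, { status: 'banned' }];
const visible = users.filter(u => isActiveStatus(u.status));
console.log(visible.length); // 1
```

This avoids both the per-call allocation of the literal array and the O(n) scan that includes does on each check.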

Kyle Zimmer