34

I have seen in an answer that the Set.has() method is O(1) and Array.indexOf() is O(n).

var a = [1, 2, 3, 4, 5];
a.indexOf(5);          


s = new Set(a);
s.has(5);              //Is this O(1)?

Is Set.has() really O(1) ?

Boann
  • 48,794
  • 16
  • 117
  • 146
Charlie
  • 22,886
  • 11
  • 59
  • 90
  • 2
    The spec requires the method to run in *sublinear* time. While `O(1)` complexity isn't *guaranteed*, IIRC it's quite likely to be what you'll run into on a normal browser environment, if the environment supports Sets at all. – CertainPerformance Mar 08 '19 at 06:28

2 Answers2

17

I don't think the array that has 5 elements is good case to check time complexity.

So based on @Shidersz's snippet, I made a new one that has many elements and invoked once.

Is Set.has() really O(1) ?

Yes. Time complexity of Set.has() is O(1) according to result of the test below.

const MAX = 10000000
let a = []
a.length = MAX

for (let i = 0; i < MAX; i++) {
  a[i] = i
}

let s = new Set(a)

let o = a.reduce((acc, e) => {
  acc[e] = e
  return acc
}, {})

console.time("Test_Array.IndexOf(0)\t")
a.indexOf(0);
console.timeEnd("Test_Array.IndexOf(0)\t")
console.time("Test_Array.IndexOf(n/2)\t")
a.indexOf(MAX / 2);
console.timeEnd("Test_Array.IndexOf(n/2)\t")
console.time("Test_Array.IndexOf(n)\t")
a.indexOf(MAX);
console.timeEnd("Test_Array.IndexOf(n)\t")

console.time("Test_Set.Has(0)\t\t")
s.has(0)
console.timeEnd("Test_Set.Has(0)\t\t")
console.time("Test_Set.Has(n/2)\t")
s.has(MAX / 2)
console.timeEnd("Test_Set.Has(n/2)\t")
console.time("Test_Set.Has(n)\t\t")
s.has(MAX)
console.timeEnd("Test_Set.Has(n)\t\t")

console.time("Test_Object[0]\t\t")
o[0]
console.timeEnd("Test_Object[0]\t\t")
console.time("Test_Object[n/2]\t")
o[MAX / 2]
console.timeEnd("Test_Object[n/2]\t")
console.time("Test_Object[n]\t\t")
o[MAX]
console.timeEnd("Test_Object[n]\t\t")
.as-console {
  background-color: black !important;
  color: lime;
}

.as-console-wrapper {
  max-height: 100% !important;
  top: 0;
}
zmag
  • 7,825
  • 12
  • 32
  • 42
  • Helpful, I'm sure you're right, but best to make a test that runs for a significant period of time - IIRC, time units less than 10 milliseconds aren't extraordinarily trustworthy. – CertainPerformance Mar 08 '19 at 06:30
  • 1
    I agree. but since the difference between two method running time is huge, I think that is good enough to show time complexity of those and I don't want to annoy SO users by long running snippet. – zmag Mar 08 '19 at 06:48
14

If one read the specification of has(), there is an algorithm describing it:

Algorithm for Set.prototype.has(value):

The following steps are taken:

  • Let S be the this value.
  • If Type(S) is not Object, throw a TypeError exception.
  • If S does not have a [[SetData]] internal slot, throw a TypeError exception.
  • Let entries be the List that is the value of S’s [[SetData]] internal slot.
  • Repeat for each e that is an element of entries,
    • If e is not empty and SameValueZero(e, value) is true, return true.
  • Return false.

And apparently, based on that algorithm and the presence of the word REPEAT one can have some confusion about it to be O(1) (we could think it could be O(n)). However, on the specification we can read that:

Set objects must be implemented using either hash tables or other mechanisms that, on average, provide access times that are sublinear on the number of elements in the collection.

Thanks to @CertainPerformance for pointing this.

So, we can create a test to compare Array.indexOf() and Set.has() in the worst case, i.e. look for an item that isn't in the array at all (thanks to @aquinas for pointing this test):

// Initialize array.
let a = [];

for (let i = 1; i < 500; i++)
{
    a.push(i);
}

// Initialize set.
let s = new Set(a);

// Initialize object.
let o = {};
a.forEach(x => o[x] = true);

// Test Array.indexOf().
console.time("Test_Array.indexOf()");
for (let i = 0; i <= 10000000; i++)
{
    a.indexOf(1000);
}
console.timeEnd("Test_Array.indexOf()");

// Test Set.has().
console.time("Test_Set.has()");
for (let i = 0; i <= 10000000; i++)
{
    s.has(1000);
}
console.timeEnd("Test_Set.has()");

// Test Object.hasOwnProperty().
console.time("Test_Object.hasOwnProperty()");
for (let i = 0; i <= 10000000; i++)
{
    o.hasOwnProperty(1000);
}
console.timeEnd("Test_Object.hasOwnProperty()");
.as-console {background-color:black !important; color:lime;}
.as-console-wrapper {max-height:100% !important; top:0;}

And now we can see that Set.has() performs better than Array.indexOf(). There is also an extra comparison to Object.hasOwnProperty() to take as reference.

Conclusion:

While O(1) complexity isn't guaranteed, the specification requires the method to run in sublinear time. And Set.has(), generally, will perform better than Array.indexOf().

Another Test:

On next example, we going to generate a random set of sample data and use it later to compare the differents methods.

// Generate a sample array of random items.

const getRandom = (min, max) =>
{
    return Math.floor(Math.random() * (max - min) + min);
}

let sample = Array.from({length: 10000000}, () => getRandom(0, 1000));

// Initialize array, set and object.

let a = [];

for (let i = 1; i <= 500; i++)
{
    a.push(i);
}

let s = new Set(a);
let o = {};
a.forEach(x => o[x] = true);

// Test Array.indexOf().

console.time("Test_Array.indexOf()");
for (let i = 0; i < sample.length; i++)
{
    a.indexOf(sample[i]);
}
console.timeEnd("Test_Array.indexOf()");

// Test Set.has().
console.time("Test_Set.has()");
for (let i = 0; i < sample.length; i++)
{
    s.has(sample[i]);
}
console.timeEnd("Test_Set.has()");

// Test Object.hasOwnProperty().
console.time("Test_Object.hasOwnProperty()");
for (let i = 0; i < sample.length; i++)
{
    o.hasOwnProperty(sample[i]);
}
console.timeEnd("Test_Object.hasOwnProperty()");
.as-console {background-color:black !important; color:lime;}
.as-console-wrapper {max-height:100% !important; top:0;}

Finally I want to apologize for the confusion the first version of my answer could cause. Thanks to all for giving me a better understanding of my mistakes.

Shidersz
  • 16,846
  • 2
  • 23
  • 48
  • I thought so too. – Charlie Mar 08 '19 at 05:57
  • 1
    Well, yes, `indexOf` performs better than a set when you have 5 elements in an array. Change your test to make an array of 1,000,000 items. Then, use the *worst* case test: look for an item that isn't in the array at all. On my machine, if I do that, and only 10,000 iterations of the loop, indexOf takes 12,374ms and the set takes 0.5ms. – aquinas Mar 08 '19 at 06:16
  • 1
    The specification also says: *Set objects must be implemented using either hash tables or other mechanisms that, on average, provide access times that are sublinear on the number of elements in the collection. The data structures used in this Set objects specification is only intended to describe the required observable semantics of Set objects.* – CertainPerformance Mar 08 '19 at 06:16
  • You both are right but it is not `O(1)` as I believed. On my machine there is a lot of difference between checking for truthy value on an object property and using `Set.has()`. Or maybe I'm missing something else... – Shidersz Mar 08 '19 at 06:32
  • It may have something to do with compiler optimization inside effectively no-op loops. [Be](https://stackoverflow.com/questions/54166875/why-are-two-calls-to-string-charcodeat-faster-than-having-one-with-another-one) [careful](https://mrale.ph/blog/2012/12/15/microbenchmarks-fairy-tale.html) when trying to draw conclusions from micro-benchmarks. – CertainPerformance Mar 08 '19 at 06:37
  • 1) You can't just call o[5], you need to call `hasOwnProperty` to see if a property exists. If you do that, (on my machine), `.has` is *BARELY* faster. YMMV slightly. 2) Just because "there is a lot of difference" between two algorithms doesn't mean one is O(1) and one isn't. They could both be O(1). Remember: it just means constant time regardless of the input size. One constant time algorithm could be 5 hours, and one could be 1 second. They're both O(1). – aquinas Mar 08 '19 at 08:13
  • @CertainPerformance I have updated the answer. Sorry for the misunderstanding of this, and thanks for your feedback. – Shidersz Mar 08 '19 at 16:47
  • 4
    When I run this test `a.indexOf()` performs 2x faster than `s.has()` (Chromium 81) – jchook Jun 20 '20 at 20:29
  • 2
    `indexOf` is outperforming `has` by like an order of magnitude when I run those (Chrome 85) – nog642 Nov 24 '20 at 08:39