142

What time complexity (in big-O notation) is provided by the ES6 specification for the Keyed Collections (Set, Map, WeakSet, and WeakMap)?

My expectation, and I expect that of most developers, is that the specifications and implementations would use widely accepted performant algorithms, in which case Set.prototype.has, add and delete would all be O(1) in the average case. The same goes for the Map and Weak– equivalents.

It is not entirely apparent to me whether the time complexity of the implementations was mandated e.g. in ECMAScript 2015 Language Specification - 6th Edition — 23.2 Set Objects.

Unless I misunderstand it (and it's certainly very possible I do), it looks like the ECMA spec mandates that the implementations (e.g. Set.prototype.has) are to use a linear time (O(n)) algorithm. It would strike me as exceedingly surprising that more performant algorithms would not be mandated or even permitted by the spec, and I would be very interested in an explanation for why this is the case.

Brian M. Hunt
  • All *O(1)* algorithms are *O(n)* as well, so allowing less performant implementations in the spec does no harm. The less performant implementations may be of interest on systems with constrained resources, as they likely require much less code / space. – artur grzesiak Jun 27 '15 at 18:16
  • @arturgrzesiak The O(1) performance of the keyed collections is generally achieved with an O(1) hash plus an O(n) collision bucket. The O(n) worst case is for most practical purposes astronomically rare. The space complexity of these techniques is generally O(n). – Brian M. Hunt Jun 27 '15 at 18:35
  • "Set objects must be implemented using either hash tables or other mechanisms that, on average, provide access times that are sublinear on the number of elements in the collection." -- from that very page. – georg Jun 27 '15 at 18:37
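The hash-plus-collision-bucket technique described in the comments can be sketched as a toy example (`ToyHashSet` is made up for illustration; real engines use far more sophisticated hashing, resizing, and key comparison):

```javascript
// Toy illustration (not how engines actually implement Set): a fixed number
// of buckets gives O(1) average lookups, while all colliding keys share one
// bucket that is scanned linearly -- the rare O(n) worst case.
class ToyHashSet {
  constructor(bucketCount = 16) {
    this.buckets = Array.from({ length: bucketCount }, () => []);
  }
  _bucket(key) {
    // Simplistic hash over the string form of the key; real engines do far better.
    const str = String(key);
    let h = 0;
    for (const ch of str) h = (h * 31 + ch.charCodeAt(0)) | 0;
    return this.buckets[Math.abs(h) % this.buckets.length];
  }
  add(key) {
    const b = this._bucket(key);
    if (!b.includes(key)) b.push(key); // linear scan within one bucket only
    return this;
  }
  has(key) {
    return this._bucket(key).includes(key);
  }
}

const s = new ToyHashSet();
s.add(1).add(2).add("a");
console.log(s.has(2), s.has("b")); // true false
```

Note that `includes` uses SameValueZero, so (like a real Set) the keys `1` and `"1"` stay distinct even though this toy hash puts them in the same bucket.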

3 Answers

86

Right from that very paragraph you linked to:

Set objects must be implemented using [mechanisms] that, on average, provide access times that are sublinear on the number of elements in the collection.

You will find the same sentence for Maps, WeakMaps and WeakSets.

It looks like the ECMA spec mandates that the implementations (e.g. Set.prototype.has) are to use a linear time (O(n)) algorithm.

No:

The data structures used in this Set objects specification is only intended to describe the required observable semantics of Set objects. It is not intended to be a viable implementation model.

The observable semantics are mostly related to the predictable iteration order (which can still be implemented efficiently). It is indeed expected by the specification that implementations use a hash table or something similar with constant access, though trees (with logarithmic access complexity) are allowed as well.
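One rough way to see that sublinearity empirically (`timePerLookup` is a made-up helper, and absolute timings vary by engine): if `Set#has` were O(n), the per-lookup cost would grow roughly 10x each time the set grows 10x, whereas a hash-based implementation stays roughly flat.

```javascript
// Rough empirical check: average cost of Set#has as the set grows.
// (Hypothetical helper; timings are engine- and machine-dependent.)
const timePerLookup = (size, lookups = 1e5) => {
  const set = new Set();
  for (let i = 0; i < size; i++) set.add(i);

  const t0 = performance.now();
  let hits = 0;
  for (let i = 0; i < lookups; i++) {
    if (set.has(i % size)) hits++; // always a hit; keeps the loop from being optimized away
  }
  const t1 = performance.now();
  return (t1 - t0) / lookups; // ms per lookup
};

for (const size of [1e3, 1e4, 1e5, 1e6]) {
  console.log(`size ${size}: ${(timePerLookup(size) * 1e6).toFixed(1)} ns/lookup`);
}
```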

Bergi
  • Thanks for picking that out. My eyes must've glazed over by the time I got to that paragraph. :) So algorithms that are either O(log(n)) or O(1), but not otherwise mandated (provided they are under O(n))? – Brian M. Hunt Jun 27 '15 at 18:37
  • @BrianM.Hunt: Correct. – Bergi Jun 27 '15 at 18:38
  • WTH? Aren't they O(log(n))?????? Are they really O(n) on average time? Is O(n) just the worst case? – canbax Feb 24 '23 at 09:05
  • @canbax Please see the first quote in my answer again. It does not permit linear average time complexity. – Bergi Feb 24 '23 at 09:40
55

For anyone who is curious, I did a very quick benchmark:

const benchmarkMap = size => {
  console.time('benchmarkMap');
  var map = new Map();
  for (var i = 0; i < size; i++) map.set(i, i);
  for (var i = 0; i < size; i++) var x = map.get(i);
  console.timeEnd('benchmarkMap');
}

const benchmarkObj = size => {
  console.time('benchmarkObj');
  var obj = {};
  for (var i = 0; i < size; i++) obj[i] = i;
  for (var i = 0; i < size; i++) var x = obj[i];
  console.timeEnd('benchmarkObj');
}

var size = 1000000;

benchmarkMap(size);
benchmarkObj(size);

I ran this a few times, which yielded the following results:

(2017 MacBook Pro, 2.5 GHz i7 w/ 16G RAM)

benchmarkMap: 189.120ms
benchmarkObj: 44.214ms

benchmarkMap: 200.817ms
benchmarkObj: 38.963ms

benchmarkMap: 187.968ms
benchmarkObj: 41.633ms

benchmarkMap: 186.533ms
benchmarkObj: 35.850ms

benchmarkMap: 187.339ms
benchmarkObj: 44.515ms
domdambrogia
  • @domdambrogia if you separate out setting from getting I get: Map Set = 124, Map Get = 40, Object Set = 26, Object Get = 1 (these are ratios not ms) – AJP Aug 07 '19 at 08:22
  • @AJP I didn't think about breaking it down with those stats as well. Thanks for your input, that's a good contribution. I'll see if I can add that into my answer when I have a second. Thanks! – domdambrogia Aug 07 '19 at 15:12
  • It would be interesting to separate the assignment from the reading, to also learn which of the two is faster for reading. – fernandopasik Dec 08 '19 at 19:53
  • "*2017 MacBook Pro, 2.5 GHz i7 w/ 16G RAM*" - uh, that's cool and all, but **which javascript engine** did you benchmark? – Bergi Jun 12 '20 at 20:12
  • Interestingly enough, when adding `delete` operations and operations are intermixed, `Map` performs much better. https://jsfiddle.net/23hrp0eq/ – Jorjon Aug 30 '20 at 20:08
  • @Bergi can't MacBooks only use JavaScriptCore - the Safari engine? – beqa Feb 04 '22 at 22:00
  • @beqa I think it's iOS where Apple doesn't let you install your own software. On macOS you can install anything you want, including e.g. nodejs with v8 or arbitrary browsers. Or even compile them yourself. – Bergi Feb 04 '22 at 23:53
  • to be fairer why don't you repeat the experiment 10 times and get the average? Also I think the experiment should use `Object.hasOwnProperty` to check the existence of a property – canbax Feb 24 '23 at 09:02
  • @canbax I wrote this in 5 minutes 4 years ago. It was a quick proof of concept to benchmark. If you're interested in your own questions you should be able to extend this rather easily to answer them. The whole point of my answer was to provide a minimal reproducible example: https://stackoverflow.com/help/minimal-reproducible-example – domdambrogia Feb 24 '23 at 20:02
  • If you add `const str = 'justString'` to your code and then `map.set(str + i, i); ... var x = map.get(str + i); ... obj[str + i] = i; ... var x = obj[str + i];` the `Map` is faster. – alexdf Jul 11 '23 at 20:11
  • @alexdf Interesting point. Are you implying that using a string key rather than a numeric key will yield better performance? Did you compare these to your own benchmarks or my previous benchmarks? – domdambrogia Jul 12 '23 at 15:14
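Following up on the suggestion in the comments to separate writes from reads, a quick sketch along the same lines as the benchmark above (same caveats: results depend heavily on the engine and on key types):

```javascript
// Split the benchmark into separate set and get phases for Map vs. Object.
const n = 1000000;

const m = new Map();
console.time('Map set');
for (let i = 0; i < n; i++) m.set(i, i);
console.timeEnd('Map set');

console.time('Map get');
let mapSum = 0;
for (let i = 0; i < n; i++) mapSum += m.get(i); // accumulate so the loop isn't optimized away
console.timeEnd('Map get');

const o = {};
console.time('Object set');
for (let i = 0; i < n; i++) o[i] = i;
console.timeEnd('Object set');

console.time('Object get');
let objSum = 0;
for (let i = 0; i < n; i++) objSum += o[i];
console.timeEnd('Object get');
```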
24

The question Is the Set.has() method O(1) and Array.indexOf O(n)? is listed as a duplicate of this one, which it isn't exactly (I've voted to re-open). I'll add these benchmarks here anyway, as the benchmarks on the replies to that question fail to show the full range of differences in performance between Set#has and Array#indexOf.

The following is all true for Chrome 93:

You'll find that for smaller datasets, Array#indexOf actually outperforms Set#has or Map#has; for larger datasets, however, Set#has and Map#has are multiple orders of magnitude faster. This is pretty consistent with what you'd expect for O(n) vs O(1) operations.

Interestingly, despite both being O(n), Array#includes is way slower than Array#indexOf for a small dataset, but performs very similarly for large datasets. Presumably, Array#indexOf takes advantage of some optimization that Array#includes doesn't.

Meanwhile, Object#hasOwnProperty slightly outperforms Set#has and Map#has in all cases (at least in Chrome 93).

Benchmarking code

const [small, medium, large] = [1e3, 1e5, 1e7]

const configs = [
    { size: small, iterations: large },
    { size: medium, iterations: medium },
    { size: large, iterations: small },
]

for (const { size, iterations } of configs) {
    const arr = Array.from({ length: size }, (_, i) => String(i))
    const obj = Object.fromEntries(arr.map(k => [k, true]))
    const set = new Set(arr)
    const map = new Map(Object.entries(obj))

    const valsToTest = Array.from(
        { length: iterations },
        (_, i) => String(Math.floor(Math.random() * size)),
    )

    const title = `dataset size: ${size.toLocaleString()}; iterations: ${iterations.toLocaleString()}`

    console.log(`\n-> ${title}`)

    for (const [target, method] of [
        [arr, 'indexOf'],
        [arr, 'includes'],
        [set, 'has'],
        [map, 'has'],
        [obj, 'hasOwnProperty'],
    ]) {
        const subtitle = `${target.constructor.name}#${method}`

        console.time(subtitle)

        for (const val of valsToTest) {
            target[method](val)
        }

        console.timeEnd(subtitle)
    }
}

My results (Chrome 93)


-> dataset size: 1,000; iterations: 10,000,000
Array#indexOf: 185.100ms
Array#includes: 11302.700ms
Set#has: 367.400ms
Map#has: 375.000ms
Object#hasOwnProperty: 252.800ms

-> dataset size: 100,000; iterations: 100,000
Array#indexOf: 10794.100ms
Array#includes: 10476.800ms
Set#has: 6.600ms
Map#has: 6.800ms
Object#hasOwnProperty: 1.900ms

-> dataset size: 10,000,000; iterations: 1,000
Array#indexOf: 12798.900ms
Array#includes: 12435.400ms
Set#has: 0.800ms
Map#has: 0.800ms
Object#hasOwnProperty: 0.300ms
Lionel Rowe
  • The question you linked does ask "*Is Set.has() really O(1)?*", which definitely is a duplicate of the canonical. – Bergi Sep 11 '21 at 14:50
  • @Bergi It asks for a comparison between `Set#has` and `Array#indexOf`, whereas this question doesn't. If I Google `set.has vs array.indexof time complexity`, that question is the first result. My answer here is a benchmark-based (as opposed to spec-based) answer to that question, with a few other comparisons thrown in for good measure. Regardless of whether the question is considered a duplicate or not, hopefully someone finds this answer useful. – Lionel Rowe Sep 11 '21 at 15:09
  • what about Set#delete ? – canbax Feb 24 '23 at 08:59