4

Recently, I had to work on optimising a task that involved the creation of really large arrays (~ 10⁸ elements).

I tested a few different methods, and, according to jsperf, the following option seemed to be the fastest.

var max = 10000000;
var arr = new Array(max);
for (let i = 0; i < max; i++) {
  arr[i] = true;
}

Which was ~ 85% faster than

var max = 10000000;
var arr = [];
for (let i = 0; i < max; i++) {
  arr.push(true);
}

And indeed, the first snippet was much faster in my actual app as well.

However, my understanding was that the V8 engine was able to perform optimised operations on array with PACKED_SMI_ELEMENTS elements kind, as opposed to arrays of HOLEY_ELEMENTS.

So my question is the following:

  • if it's true that new Array(n) creates an array that's internally marked with HOLEY_ELEMENTS, (which I believe is true) and
  • if it's true that [] creates an array that's internally marked with PACKED_SMI_ELEMENTS (which I'm not too sure is true)

why is the first snippet faster than the second one?

Related questions I've been through:

bugs
  • 14,631
  • 5
  • 48
  • 52
  • 3
    IIRC packed vs holey isn't that much of a performance difference, it's just an internal marker that the array needs to be treated differently (especially given that many things changed in V8 since that talk). Also, the performance difference should be observed in *using* the array, not in creating and initialising it. What exactly did you benchmark, your whole app or only the two snippets you posted? – Bergi Feb 01 '19 at 14:59
  • @Bergi thanks, do you have any link to some documentation about this? In terms of benchmark, I measured the performance of a function that involved the creation and the usage of the array (the latter being the same in both cases) – bugs Feb 01 '19 at 15:03
  • 1
    I found the link to one of @jmrk's answers: https://stackoverflow.com/a/53161007/1048572 (see the comments for discussion of holey vs packed). Not sure whether that question is duplicate-worthy. – Bergi Feb 01 '19 at 15:08

1 Answers1

10

V8 developer here. The first snippet is faster because new Array(max) informs V8 how big you want the array to be, so it can allocate an array of the right size immediately; whereas in the second snippet with []/.push(), the array starts at zero capacity and has to be grown several times, which includes copying its existing elements to a new backing store.

https://www.youtube.com/watch?v=m9cTaYI95Zc is a good presentation but probably should have made it clearer how small the performance difference between packed and holey elements is, and how little you should worry about it.

In short: whenever you know how big you need an array to be, it makes sense to use new Array(n) to preallocate it to that size. When you don't know in advance how large it's going to be in the end, then start with an empty array (using [] or new Array() or new Array(0), doesn't matter) and grow it as needed (using a.push(...) or a[a.length] = ..., doesn't matter).

Side note: your "for loop with new Array() and push" benchmark creates an array that's twice as big as you want.

jmrk
  • 34,271
  • 7
  • 59
  • 74
  • It's great to get an answer directly from a V8 developer, thanks for the clarification. Is there any article / documentation I can have a look at regarding all this? And thanks for pointing out the mistake in jsperf, small oversight from my side – bugs Feb 01 '19 at 19:05
  • @bugs: Blog and docs on http://v8.dev contain lots of information, and then there are various videos of technical presentations on YouTube, at least one of which you've found already ;-) – jmrk Feb 01 '19 at 22:58
  • 1
    @vsemozhetbyt: Correct, that's a relevant feature request; it has quite low priority because for most real-world code the distinction doesn't matter. – jmrk Feb 01 '19 at 22:58
  • @jmrk `new Array(n)` will also leave your array in HOLEY mode though. Isn't that seen as bad practice? – snek Feb 03 '19 at 14:59
  • @snek: Using `new Array(n)` is perfectly fine, it is not a bad practice. HOLEY mode is nothing to worry about (see the second paragraph of my answer). See it as "for the curious, here's one of many tricks that V8 performs under the hood to squeeze as much performance as possible out of any code you throw at it". Don't see it as something that you must be aware of and avoid. – jmrk Feb 04 '19 at 19:25
  • @jmrk i have few questions. First: with new Array(n) there is no type precised, how v8 can allocate space? What about the saying that holey arrays will not convert to contiguous arrays (specially if big), But to a dictionary which have a more slower access (additional lockups steps)? For smaller sizes is it contiguous or always a dictionary? Using Holey improve the allocation (as y answered), but for a slower access and operations. PACKED have better operations and access performance, vs slow reallocation if the array grow. Are typed arrays the solution (Not HOLEY + Contiguous)? – Mohamed Allal Jun 15 '19 at 09:01
  • (1) Each entry is one pointer big, always. (2) Dictionary mode is entirely separate from "holey". (3) "Holey" is always contiguous. (4) Don't worry about the access speed of "holey" vs "packed", it is almost never measurable in practice. (5) TypedArrays come with a different set of tradeoffs, so use them when appropriate, not always. (6) Generally, write code that makes sense, let the engine worry about making it fast. (7) Next time, please ask a separate question; comments are too short for this discussion. – jmrk Jun 16 '19 at 12:22