
I'm getting lost in the weeds with V8 source as well as articles on the subject and I came across a blog post which stated:

If you are forced to fill up an array with heterogeneous elements, let V8 know early on by using an array literal especially with fixed-size small arrays.

let array = [77, 88, 0.5, true]; //V8 knows to not allocate multiple times.

If this is true, then why is it true? Why an array literal? What's so special about that vs. creating an array via a constructor? Being new to the V8 source, I find it difficult to track down where the difference between homogeneous and heterogeneous arrays lies.

Also, if an answerer can point me towards the relevant V8 source, that'd be appreciated.

EDIT: slight clarification on my question (array literal vs. array constructor)

Matt
  • `heterogeneous` elements are essentially elements that aren't of the same underlying data type – Derek Pollard Jun 20 '18 at 03:23
  • https://www.html5rocks.com/en/tutorials/speed/v8/ - it includes the exact same example you show. Always remember, "premature optimization is the root of all evil" (Knuth). Also, you'll be optimizing explicitly for V8, which does have 60%+ market share, but still is not everything. – ASDFGerte Jun 20 '18 at 03:24
  • The V8 engine goes through various optimizations and restructuring when you add elements of new data types, and that inherently costs performance. These may be micro-optimizations, but they still cost something – Derek Pollard Jun 20 '18 at 03:24
  • Also, this talk is probably one of the best that goes into depth on the type optimizations, and it's given by one of the V8 developers: https://www.youtube.com/watch?v=EhpmNyR2Za0 – Derek Pollard Jun 20 '18 at 03:26
  • @Derek I get the disparate data types, but what I don't understand is why an array literal is necessary to "let v8 know" vs, say, a constructor'd array. (Thanks for the links, btw) – Matt Jun 20 '18 at 03:29
  • Your intuition is right that regarding the types of elements, `[1, 0.5, true]` is just as good as `Array(1, 0.5, true)`. Aside from that, however, V8 has special optimizations for literals (because they're so common!), so for many cases the literal will have a slight benefit. But don't worry about any of that too much, just write the code that makes sense to you and let V8 worry about the rest. – jmrk Jun 20 '18 at 04:58
  • @jmrk What (or where) are those optimizations? I know this might be overly pedantic, but I’m genuinely curious about the differences. – Matt Jun 20 '18 at 06:00
  • Literals use "boilerplate" objects that can be quickly copied to make instantiation of objects faster. For Array literals, in some cases even that is skipped and you get a copy-on-write array first, i.e. the actual copy is only created once you modify the array. I prefer not to be more specific because (1) developers reading this shouldn't over-optimize for internal details and (2) these details tend to change from time to time (as new optimizations are developed, not-so-useful optimizations are dropped to simplify things, heuristics are tuned, etc.). – jmrk Jun 20 '18 at 17:47

1 Answer


From this blog post provided by Mathias, a V8 developer:

Common elements kinds

While running JavaScript code, V8 keeps track of what kind of elements each array contains. This information allows V8 to optimize any operations on the array specifically for this type of element. For example, when you call reduce, map, or forEach on an array, V8 can optimize those operations based on what kind of elements the array contains.

Take this array, for example:

const array = [1, 2, 3];

What kinds of elements does it contain? If you’d ask the typeof operator, it would tell you the array contains numbers. At the language-level, that’s all you get: JavaScript doesn’t distinguish between integers, floats, and doubles — they’re all just numbers. However, at the engine level, we can make more precise distinctions. The elements kind for this array is PACKED_SMI_ELEMENTS. In V8, the term Smi refers to the particular format used to store small integers. (We’ll get to the PACKED part in a minute.)

Later adding a floating-point number to the same array transitions it to a more generic elements kind:

const array = [1, 2, 3];
// elements kind: PACKED_SMI_ELEMENTS
array.push(4.56);
// elements kind: PACKED_DOUBLE_ELEMENTS

Adding a string literal to the array changes its elements kind once again.

const array = [1, 2, 3];
// elements kind: PACKED_SMI_ELEMENTS
array.push(4.56);
// elements kind: PACKED_DOUBLE_ELEMENTS
array.push('x');
// elements kind: PACKED_ELEMENTS

....

V8 assigns an elements kind to each array. The elements kind of an array is not set in stone — it can change at runtime. In the earlier example, we transitioned from PACKED_SMI_ELEMENTS to PACKED_ELEMENTS. Elements kind transitions can only go from specific kinds to more general kinds.

Thus, if you keep adding different types of data to an array at run time, V8 has to transition the array to a more general elements kind behind the scenes, and operations on it can no longer use the optimizations available for the more specific kind.
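To make the contrast from the question concrete, here is a minimal sketch that builds the same heterogeneous array in one step and incrementally. The elements-kind notes in the comments follow the transitions described in the quoted post; the kind noted for the literal is an assumption drawn from that description, not something verified against the V8 source. `sameContents` is a hypothetical helper added only for this illustration.

```javascript
// Hypothetical helper: true when both arrays hold the same values in order.
function sameContents(a, b) {
  return a.length === b.length && a.every((value, i) => Object.is(value, b[i]));
}

// Built in one step: every element type is visible to the engine up front.
// Assumption: V8 can pick the most general kind needed right away.
const literal = [77, 88, 0.5, true];

// Built incrementally: per the quoted post, the array starts with a specific
// elements kind and moves to more general ones as new types arrive.
const incremental = [];
incremental.push(77);   // small integers only so far (PACKED_SMI_ELEMENTS)
incremental.push(88);
incremental.push(0.5);  // a double arrives (PACKED_DOUBLE_ELEMENTS)
incremental.push(true); // a non-number arrives (PACKED_ELEMENTS)

// At the language level the two arrays are indistinguishable.
console.log(sameContents(literal, incremental)); // true
```

Any difference between the two lives inside the engine; the language itself gives no way to tell them apart, so this only shows the two construction patterns side by side.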

As for constructor vs. array literal:

If you don’t know all the values ahead of time, create an array using the array literal, and later push the values to it:

const arr = [];
arr.push(10);

This approach ensures that the array never transitions to a holey elements kind. As a result, V8 can optimize any future operations on the array more efficiently.

Also, to clarify what is meant by holey:

Creating holes in the array (i.e. making the array sparse) downgrades the elements kind to its “holey” variant. Once the array is marked as holey, it’s holey forever — even if it’s packed later!
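As a rough illustration of what a hole is at the language level, the sketch below uses the `in` operator to detect missing indices. `hasHoles` is a hypothetical helper, not part of V8 or the standard library. Note that this check only sees the array's current contents; it cannot observe V8's internal elements kind, which the quoted post says stays holey even after every hole has been filled.

```javascript
// Hypothetical helper: true if any index below `length` has no element.
function hasHoles(arr) {
  for (let i = 0; i < arr.length; i++) {
    if (!(i in arr)) return true;
  }
  return false;
}

// new Array(3) creates an array of length 3 that contains only holes.
const preallocated = new Array(3);

// Starting from an empty literal and pushing never creates a hole.
const grown = [];
grown.push(10, 20, 30);

console.log(hasHoles(preallocated)); // true
console.log(hasHoles(grown));        // false

// Filling every slot removes the holes as far as the language can tell,
// but per the quoted post V8 still treats the array as holey internally.
preallocated[0] = 1;
preallocated[1] = 2;
preallocated[2] = 3;
console.log(hasHoles(preallocated)); // false
```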

It might also be worth mentioning that V8 currently has 21 different element kinds.


Mathias Bynens
Derek Pollard
  • Good stuff. I'm still lost on constructor vs. literal. Say I do `const c = new Array(true, 1, 2, 'cat')` and then do `const d = [true, 1, 2, 'cat']`. Does V8 see these any differently? There seems to be some ambiguity about whether there's even a difference between the two in general (see [this older SO post](https://stackoverflow.com/questions/931872/what-s-the-difference-between-array-and-while-declaring-a-javascript-ar)). Thanks! – Matt Jun 20 '18 at 03:51
  • @Matt in that particular instance, there really doesn't appear to be that big of a difference, mainly because you are starting off, in both instances, with an array that has values. The main place that `[]` takes precedence over the constructor is in this type of example: `const array = new Array(3);` – Derek Pollard Jun 20 '18 at 03:56
  • because you are effectively creating a holey array right from the start – Derek Pollard Jun 20 '18 at 03:56
  • Does the straight-from-the-jump holey-ness of the array have anything to do with the `kPreallocatedArrayElements = 4` in `/src/objects/js-array.h` (line 78) of the V8 source? – Matt Jun 20 '18 at 04:00
  • Yeah, so effectively `V8` ends up transitioning through various types, the initial allocation type being `HOLEY_SMI_ELEMENTS`, then if you were to push a string value, it would have to re-allocate to optimize for `HOLEY_ELEMENTS`. `[]` has none of these issues – Derek Pollard Jun 20 '18 at 04:03
  • `kPreallocatedArrayElements` has nothing to do with holey-ness. `var a = new Array(3)` is an array that contains 3 holes and nothing else. That said, don't worry about holes too much -- Mathias' talk over-emphasizes them a bit. Just write the code that makes sense for your use case, let V8 worry about the rest. – jmrk Jun 20 '18 at 04:56
  • @jmrk as this seemed more of an inquiry into what's going on, rather than practical usage, it didn't seem right to add that to my answer. But yes, definitely prefer readability and maintainability over micro-optimizations – Derek Pollard Jun 20 '18 at 05:06