
I accidentally compared a large array and a number with <, and JavaScript locked up for over 5 seconds. What is the expected behavior of this comparison? Is it iterating over the whole array? MDN didn't clarify the situation.

As a concrete example, this code snippet takes over 5 seconds to print done:

var m = [];
m[268435461] = -1;
console.log('start');
if (m < 0) { }
console.log('done');
Mike 'Pomax' Kamermans
Ken Shirriff
    You are assigning the 268435461st entry of a sparse array the value -1 - is that what you wanted to do? – mplungjan Mar 26 '19 at 18:09
  • 5
    You've created a sparse array, which is stored differently from regular arrays to be more memory efficient. They're also slower. –  Mar 26 '19 at 18:09
  • 3
    The comparison is slow because `m` is first coerced to a (huge) string. `let x = []; x[3] = -1; '' + x` gives `,,,-1`. Your `m` is coerced to 268435461 commas followed by -1. At least on modern Chrome, creating a sparse array takes a few milliseconds, and is **not** the reason your example is slow. – Igor Raush Mar 26 '19 at 18:20
  • Why would it coerce it to a string instead of a number? –  Mar 26 '19 at 18:24
  • 1
    Long story short, (loosely) comparing an object to a number invokes the [`ToPrimitive`](https://tc39.github.io/ecma262/#sec-toprimitive) procedure, which invokes `valueOf`, doesn't get back a primitive, and invokes `toString`. – Igor Raush Mar 26 '19 at 18:31
  • Looks like Igor already covered that. – Mike 'Pomax' Kamermans Mar 26 '19 at 20:25
  • @mplungjan Yes, I have a large sparse array. The comparison is what I did by accident. – Ken Shirriff Mar 26 '19 at 20:43

1 Answer


Javascript "arrays" (those with Array prototype, not typed arrays), are just objects, therefore this

var m = [];
m[268435461] = -1;

is exactly the same as

var m = {
    "268435461": -1
}

except that in the first case, m has the Array prototype and a special length property.
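
A quick way to convince yourself of this (a rough check; exact console output may vary by engine):

var m = [];
m[268435461] = -1;

console.log(Object.keys(m));        // ["268435461"] - a single own property
console.log(m.hasOwnProperty(0));   // false - the "holes" are not stored at all
console.log(Array.isArray(m));      // true - it still has the Array prototype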

However, the methods defined on Array.prototype (like forEach or join) try to hide that fact and "emulate" sequential arrays as they exist in other languages. When iterating over their "this" array, these methods read its length property, count the loop variable from 0 up to length - 1, and do something with the value stored under the key String(i) (or with undefined if there is no such key):

// sketch of the built-in js array iteration algorithm

for (let i = 0; i < this.length; i++) {
    if (this.hasOwnProperty(String(i)))
        do_something_with(this[String(i)])
    else
        do_something_with(undefined)
}
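
Because these methods only look at length and the String(i) keys, they even work on a plain object; a minimal sketch of that contract:

var fake = { length: 3, "0": "a", "2": "c" };
console.log(Array.prototype.join.call(fake));   // "a,,c" - the missing key "1" is treated as a hole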

Now, the length of an array is not the number of elements in it, as the name might suggest, but rather its largest numeric key + 1, so in your case, length will be 268435462 (check it!).
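
A minimal check:

var m = [];
m[268435461] = -1;
console.log(m.length);              // 268435462 - largest numeric key + 1
console.log(Object.keys(m).length); // 1 - only one element is actually stored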

When you do m < 0, that is, compare a non-number to a number, JS first converts the array to a primitive: valueOf doesn't return one, so toString is used. Array.prototype.toString invokes Array.prototype.join, which, in turn, uses a loop like the one above to convert the elements to strings and insert a comma between them:

// sketch of the built-in js Array.join algorithm

target = '';

for (let i = 0; i < this.length; i++) {
    if (i > 0)
        target += ','

    let element = this[String(i)]

    if (element !== undefined && element !== null)
        target += element.toString()
}

Illustration:

m = [];
m[50] = 1;
console.log(m.join())   // prints 50 commas followed by "1"
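
And a rough sketch of how the coercion chain behind m < 0 plays out on a smaller array (the precise steps are the spec's ToPrimitive and Abstract Relational Comparison algorithms):

var m = [];
m[5] = -1;

console.log(m.valueOf() === m);     // true - valueOf doesn't return a primitive
console.log(m.toString());          // ",,,,,-1" - built by Array.prototype.join
console.log(Number(m.toString()));  // NaN - the resulting string isn't a valid number
console.log(m < 0);                 // false - after all that work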

This involves lots of memory allocations, and that's what is causing the delay.

After some more testing: the allocations are not the deciding factor here; "hollow" loops over the same index range cause the same slowdown:

console.time('small-init')
var m = [];
m[1] = -1;
console.timeEnd('small-init')

console.time('small-loop')
m.forEach(x => null)
console.timeEnd('small-loop')

console.time('big-init')
var m = [];
m[1e8] = -1;
console.timeEnd('big-init')

console.time('big-loop')
m.forEach(x => null);
console.timeEnd('big-loop')

That being said, modern JS engines are not that naive and don't implement iteration exactly as described above. They do have array-specific optimizations in place, but those optimizations target "good" dense, sequential arrays, not bizarre edge cases like this one. Bottom line: don't do that!

georg
  • Arrays are `Vectors` or `ArrayList`s exactly like they exist in other languages. Sure, JS has a unified way of accessing elements in the same way as properties, but I wouldn't say that JS arrays aren't "sequential". Imo you should just drop the first part of your answer, it doesn't seem relevant to the question. – Bergi Mar 26 '19 at 18:43
  • @Bergi: last time I checked, Ecma spec didn't mention Vectors or ArrayLists, it does, however, say that "A[rrays]'s essential internal methods... [are] the default ordinary object definitions" – georg Mar 26 '19 at 18:48
  • 2
    The ECMA spec doesn't detail memory layout of objects and arrays at all, so it's irrelevant. In every JS engine implementation I've encountered so far, arrays and objects were handled very differently though. – Bergi Mar 26 '19 at 18:52
  • @Bergi: so, why exactly is this code slow? An array optimization should have kicked in and realized that there's only one element in the array, not a gazillion... – georg Mar 26 '19 at 19:01
  • It probably did realise that there are no elements in the array. But as you correctly explained, it still has to build a 268435463-character string, which is the slow thing. – Bergi Mar 26 '19 at 19:12
  • @Bergi: ok, how about https://jsfiddle.net/fbv761uc/ ? All I do is `x.forEach(null)` - no allocations involved, but the "big" array is still very slow. – georg Mar 26 '19 at 19:33
  • I'm not contesting that large arrays are slow (regardless whether sparse or not). I just think that "arrays are plain objects" is both misleading and irrelevant to the question. – Bergi Mar 26 '19 at 19:52
  • @georg JS tries to convert both sides of the `<` to types that can actually be compared, so in this case, it converts both sides to strings, and `m.toString()` when there are 268,435,461 elements (268,435,460 are `undefined`) will need to allocate about half a gig of RAM just to hold the resulting string. That's going to take a few seconds. – Mike 'Pomax' Kamermans Mar 26 '19 at 20:27
  • @Bergi: according to the spec, arrays are objects. How they are actually implemented in the respective engine's C++ code is secondary and subject to change. Even as a metaphor, that seems to explain things very well, in particular why it's not the assignment line that causes the OP's problems (no allocation of a gazillion elements happens), but the comparison line (where a gazillion loop iterations actually happen). – georg Mar 26 '19 at 20:40
  • @Mike'Pomax'Kamermans: as I tried to explain (and with what Bergi disagrees) there are actually no 268,435,461 elements in this array - just one. – georg Mar 26 '19 at 20:42
  • I don't disagree that the array in the question is sparse, has only one element and takes little memory. I just think that the phrasing "*Javascript "arrays" are just objects, […] is exactly the same as […] [array] methods [...] are trying to hide that fact and "emulate" sequential arrays, as they exist in other languages.*" is highly misleading (if not plain wrong) and irrelevant. – Bergi Mar 26 '19 at 20:48
  • On a true, spec-wise, technical note, Arrays are Objects, not "vectors" or "arraylists" or "dictionaries" or any other term not in the spec. On a JS Engine _implementation_ level, the fact that they're modelled as specific C++ or Rust etc. datatypes is mostly irrelevant to this discussion. Arrays, as far as JS is concerned, are a more specific type of Object, with an Array prototype and special handling around numeric property names. Even if the underlying native code, to which no one has access at runtime, uses some kind of indexed list primitive. – Mike 'Pomax' Kamermans Mar 26 '19 at 22:41