15

Possible Duplicate:
Are Javascript arrays sparse?

I am learning JavaScript at the moment and have been reading some simple introductions and tutorials. While looking at the Array object I stumbled upon some details, which strike me as very odd, coming from other languages like C/Java/Scala/...


So lets assume we define an array as such:

var arr = ['foo','bar','qux']

We now assign

arr[5] = 'baz'

which results in our array looking like this:

arr
>> ["foo", "bar", "qux", undefined, undefined, "baz"]

And the length is as expected

arr.length
>> 6

JavaScript has kindly expanded our array to the needed length - six - and the new items are set to undefined - except for the one we actually assigned a value to.

From a low level point of view this is horrible memory-wise. Typically an array would be a continuous range in memory - making an array bigger generally involves copying the whole array to a new memory location, sufficient in size. This is a very costly operation.

Now, I do realize that this is likely not what JavaScript engines are doing, as copying around arrays would be crazy expensive and the memory space would be wasted on all these 'undefined' values.

Can someone tell me what actually happens behind the door?

  • Are arrays actually some sort linked lists?
  • Are the 'undefined' array items actually there?
  • How expensive is it to work with large arrays that are mostly filled with 'undefined'?
Community
  • 1
  • 1
fgysin
  • 11,329
  • 13
  • 61
  • 94
  • Doesnt JS just have an automatic garbage collector that will clean up the mess? – Alex Gill Sep 28 '12 at 12:50
  • This does actually appear to be a duplicate. Considering the title and tags of the first question I don't feel too bad for not finding it though... ;) – fgysin Sep 28 '12 at 15:10

5 Answers5

16

In the first version of JavaScript, there were no arrays. They were later introduced as a sub-class of that "mother of all objects": Object. You can test this quite easily by doing this:

var foo = [1,2,3,4];
for (var n in foo)
{//check if n is equal (value and type) to itself, coerced to a number
    console.log(n === +(n) ? 'Number' : 'String');
}

This will log String, time and time again. Internally, all numeric keys are converted to strings. The Length property merely fetches the highest index, and adds 1 to it. Nothing more. When you display your array, the object is iterated, and for each key, the same rules apply as for any object: first the instance is scanned, then the prototype(s)... so if we alter our code a bit:

var foo = [1,2,3,4];
foo[9] = 5;
for (var n in foo)
{
    if (foo.hasOwnProperty(n))
    {//check if current key is an array property
        console.log(n === +(n) ? 'Number' : 'String');
    }
}

You'll notice the array only has 5 own properties, the undefined keys 4-8 are undefined, because there was no corresponding value found within the instance, nor in any of the underlying prototypes. In short: Arrays aren't really arrays, but objects that behave similarly.

As Tim remarked, you can have an array instance with an undefined property that does exist within that object:

var foo = [1,2,undefined,3];
console.log(foo[2] === undefined);//true
console.log(foo[99] === undefined);//true

But again, there is a difference:

console.log((foo.hasOwnProperty('2') && foo[2] === undefined));//true
console.log((foo.hasOwnProperty('99') && foo[99] === undefined));//false

RECAP, your three main questions:

  • Arrays are objects, that allow you to reference their properties with numeric instances

  • The undefined values are not there, they're merely the default return value when JS scans an object and the prototypes and can't find what you're looking for: "Sorry, what you ask me is undefined in my book." is what it says.

  • Working with largely undefined arrays doesn't affect the size of the object itself, but accessing an undefined key might be very, very marginally slower, because the prototypes have to be scanned, too.

Update:

Just quoting the Ecma std:

15.4 Array Objects
Array objects give special treatment to a certain class of property names. A property name P (in the form of a String value) is an array index if and only if ToString(ToUint32(P)) is equal to P and ToUint32(P) is not equal to 2^32 1. A property whose property name is an array index is also called an element. Every Array object has a length property whose value is always a nonnegative integer less than 2^32. The value of the length property is numerically greater than the name of every property whose name is an array index; whenever a property of an Array object is created or changed, other properties are adjusted as necessary to maintain this invariant. Specifically, whenever a property is added whose name is an array index, the length property is changed, if necessary, to be one more than the numeric value of that array index; and whenever the length property is changed, every property whose name is an array index whose value is not smaller than the new length is automatically deleted. This constraint applies only to own properties of an Array object and is unaffected by length or array index properties that may be inherited from its prototypes.

An object, O, is said to be sparse if the following algorithm returns true:
1. Let len be the result of calling the [[Get]] internal method of O with argument "length".
2. For each integer i in the range 0≤i
    a. Let elem be the result of calling the [[GetOwnProperty]] internal method of O with argument        ToString(i).
     b. If elem is undefined, return true.
3. Return false.

Elias Van Ootegem
  • 74,482
  • 9
  • 111
  • 149
  • Of course, it's also possible to explictly set a property as `undefined`, in which case the property actually is there. – Tim Down Sep 28 '12 at 13:13
  • Yes, of course, but in that case `.hasOwnProperty` returns true, added a code example to reflect that – Elias Van Ootegem Sep 28 '12 at 13:15
  • You can create numeric properties on regular objects too (although the property name is always converted to a string). The only difference with an array is that setting a numeric property also updates the `length` property. – Tim Down Oct 01 '12 at 09:40
  • Sure, I'm not contradicting this anywhere am I? Basically: the Array prototype augments the `Object` prototype, so even array keys are converted to numbers. Last time I checked, `length` is just a property set to the highest key/name of the array instance + 1. It's updated on every change made to the array, Like I stated in my answer [15.4 Array Objects - first paragraph](http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-262.pdf). Honestly, I can't see where I'm contradicting anything you said, or am I missing something? – Elias Van Ootegem Oct 01 '12 at 09:54
  • 1
    The sentence I was referring to is *"Arrays are objects, that allow you to reference their properties with numeric instances"*, which seems to me to imply that non-Array objects do not allow this, but I think it's just a small ambiguity. Sorry to be pedantic. – Tim Down Oct 01 '12 at 11:31
  • Don't be sorry, you're right. If ever there were to be a _"Pedantic"_ badge on SO, I'll probably be one of the bookies favourites ;) – Elias Van Ootegem Oct 01 '12 at 11:40
  • So arrays are not continuous memory? What were they thinking by doing this lose of fast access and fetching? – shinzou Jun 25 '18 at 10:03
  • 1
    @shinzou: JS originally didn't have arrays, so they never had the performance benefit of simple fetching to being with. They just augmented the hashtable implementation to use int values for hashing instead. JS is just to high-level a language to be running in browser I guess. That's not to say that underneath it all, some engines might not optimise the array type to actually map onto a non-hashing map. IIRC the V8 engine just creates an underlying C++ style object effectively rendering all lookups O(1) operations – Elias Van Ootegem Jun 25 '18 at 10:07
  • Anyway, I haven't touched JS in quite some time anyway (Golang FTW), not doing any client-side stuff at all, and working a bit closer to the metal now. Not eager to go back to interpreted, messy languages like JS at all – Elias Van Ootegem Jun 25 '18 at 10:08
4

Arrays are just an ordered list of objects. In JavaScript everything is an object, so arrays are not really arrays as we know them :)

You can find little internals here.

For your doubts about working with large arrays... Well, remember that the less calculation you make "client-side", the faster will be your page.

napolux
  • 15,574
  • 9
  • 51
  • 70
  • 4
    Not everything is an object. Strings, numbers, boolean values, undefined and null are not objects. They are literal values (unless you specifically create the ones that can be objects as an object). – James Allardice Sep 28 '12 at 12:53
  • Strings aren't objects? Then why do they have methods? – Philipp Sep 28 '12 at 13:09
  • 3
    @Philipp: They don't. They only appear to because they are automatically converted to `String` objects when a method is called. http://stackoverflow.com/a/9110389/96100 – Tim Down Sep 28 '12 at 13:12
1

Answers:

  1. An array in JavaScript is just the same as an object (i.e. an unordered collection of properties) with a magic length property and extra prototype methods (push() etc.)
  2. No, the undefined items are not there. JavaScript has an in operator that test for the existence of a property that you can use to prove this. So for the following array: var arr = ['foo']; arr[2] = 'bar';, 2 in arr returns true and 1 in arr returns false.
  3. A sparse array should take up no more memory than a dense array whose length is the number of properties actually defined in your sparse array. It will only be more expensive to work with a sparse array when you iterate over its undefined properties.
Tim Down
  • 318,141
  • 75
  • 454
  • 536
0

Most javascript implementations implement arrays as some flavor of binary tree or hash table with the array index as the key, so a large range of undefined objects does not use up any memory.

Philipp
  • 67,764
  • 9
  • 118
  • 153
0

I was told that the arrays come in 2 parts, [value, pointer]. So the pointer of arr[2] is null. When you add a 5 it changes the address from null to point to number 3, which points to number 4, which points to number 5, which is null (so end of array).

Im not sure how true this is as ive never actually checked it. But it seems to make sense.

So you cant do the maths like on a c type array (ie to get to value 4 just do starting memory point + 4x (object amount in memory)) but you can do it by following the array peice by peice

exussum
  • 18,275
  • 8
  • 32
  • 65