-3

This is a follow up to my earlier question JavaScript arrays: how do I compact excess length?

I just discovered that all javascript engines I have (Chromium, Node.js, Firefox) are extremely inefficient with sparse arrays! Example:

[,1,2,3,,,,,9,,,].find((x,i,a) => console.log(x,i))

It turns out that all the holes are also hit by this search. But in a real world use case of large sparse arrays this is extremely inefficient!

One might argue that formally one should require that find would return undefined for missing values instead of skipping them and therefore never being able to return them. But that seems to be such a rarely useful property of these iterating functions and instead the spec should say that holes in sparse arrays are to be skipped.

It seems to come down to the iterating with the in vs. the of operator.

So might there be a way of redefining or modifying the iteration of these iterating functions to be based only on indexes actually there? Especially would like to find first index and last index in array, and also next index given an index (that may or may not be the index of a hole).

Gunther Schadow
  • 1,490
  • 13
  • 22
  • 2
    What is wrong with {"1":1,"10":2}? – huseyin tugrul buyukisik Apr 30 '22 at 21:21
  • Generally "why" questions about languages and their implementations don't fit with SO well, and may actually be 'banned' as opinion-based. The only person who can give a definitive answer is one the language developers. Anyways, looking up "javascript sparse arrays", I found this short explanation: https://www.freecodecamp.org/news/sparse-and-dense-arrays-in-javascript/, and https://dmitripavlutin.com/javascript-sparse-dense-arrays/, and on SO, https://stackoverflow.com/questions/52191036/how-does-javascript-create-sparse-arrays – hpaulj Apr 30 '22 at 22:05
  • In languages like Python, there's a clear distinction between dense `list` and "sparse` `dict`, and between dense `numpy` arrays, and `scipy/sparse` matrices. Most SO questions tagged with [sparse-matrix] refer to languages the explicitly develop sparse matrices. In Javascript things are more fluid. It's easy to create arrays with "holes", and apparently, the implementation is free to store them in a dense or a sparse manner. There are pros and cons to each, depending in large part on the "sparsity" of the arrays. – hpaulj Apr 30 '22 at 22:14
  • Digging around confirms my impression that this is a Javascript development issue. While `array`, `delete`, `length` and the `for-in` iterator are part of the original Javascript (as nicely discussed in Crockford's Good Parts 2008), iterators like `findLastIndex` are relatively recent additions. They may have been pioneered in packages like `underscore` and `lodash`, and then added piecemeal to the various engines. But some links suggest `findLastIndex` is still at the proposal stage for ECMAScript. https://github.com/tc39/proposal-array-find-from-last – hpaulj May 02 '22 at 15:06
  • `underscore`, https://underscorejs.org/docs/underscore-esm.html, implements `find(Last)Index` with a `for (; index >= 0 && index < length; index += dir)` where `length` is just the array's property. Clearly it was written with a dense array and accurate `length` in mind, not your kind of sparse array with lots of trailing holes. – hpaulj May 02 '22 at 15:39
  • Crockford makes the case for a C like enumeration of an array: "Since JavaScript’s arrays are really objects, the for in statement can be used to iterate over all of the properties of an array. Unfortunately, for in makes no guarantee about the order of the properties, and most array applications expect the elements to be produced in numerical order. Also, there is still the problem with unexpected properties being dredged up from the prototype chain." – hpaulj May 02 '22 at 15:49

2 Answers2

0

The length property on an Array takes the last element's index and adds one. So if you have an array with holes between index 0 through 100, and an element at index 101, the length will return 101, as it's the last index + 1.

The above happens regardless of how many elements are in the Array.

This is a documented feature. If you want to ignore empty spaces, just check if the item at whatever index is falsy, and then don't do the logic. You're more than welcome to modify prototypes to do this, at your own peril of course.

gear4
  • 745
  • 6
  • 13
  • This answer literally goes off on a tangent (that of length) lecturing about something that is obvious to anyone who has spent a few minutes with the matter but wholly unaware and unappreciative of the deeper issues to which it responds (which are all settled on the referred-to question). – Gunther Schadow May 01 '22 at 13:22
  • @GuntherSchadow you are approaching the entire idea of JavaScript incorrectly. {} is the correct choice. [] is the wrong choice. it's that simple – gear4 May 01 '22 at 16:19
0

The answer is: yes, find, findIndex, findLast, findLastIndex are indeed not efficient on large sparse arrays, but no, that isn't true for all the iterating array functions.

In fact, forEach, filter, map, and reduce, even some and every, do not call the callback function on the holes in the sparse arrays, which I tested in Chromium, Firefox, and Node.JS. Therefore if the intention of using findLastIndex is to replace naive use of the array length property which is unreliable (and therefore generally useless if often practically useful when ignoring many sparse array scenarios) then the following can be done:

(function(arr) {
  try {
    return arr.reduceRight(
      function(r,x,i,a) {
         console.log(r,x,i,a);
         throw i;
      }, -1);
  } catch(r) {
    if(typeof r == 'number')
      return r;
    else
      throw r;
  }
})([,,,3,,,6,,,9,,,])

I don't know if there is a gentler way of breaking out of such an iterator function loop other than throwing the result value, but it works and it cuts short what would otherwise continue after the result has already been found.

In case of find(Index) from the beginning (instead of the end), where a sparse array could also cause millions of iterations before even the first element is found, the same can be achieved with forEach:

let testArray = [];
testArray[20000000] = "first";
testArray[40000000] = "last";
(function(arr) {
  try {
    arr.forEach(
      function(x,i,a) {
         console.log(x,i,a);
         throw i;
      });
    return -1;
  } catch(r) {
    if(typeof r == "number")
      return r;
    else
      throw r;
  }
})(testArray)

As you run this with different index values (creating different size initial sparseness) you notice that (again on all three, Chromium, Firefox, and Node.JS) the implementation is inefficient as it seems to internally iterate instead of maintaining a direct pointer to the first actual index, last actual index, and potentially even next-pointers to skip over large holes. But this is OK, those are optimizations that can be added, at least the specification doesn't require that the call-back is invoked on the holes of a sparse array as it seems to be the case with find (which I think is a bad idea).

The following goes into my own general toolbox from now on:

Array.prototype.findActualLastIndex = function(f) {
    try {
    return this.reduceRight(
        function(r, x, i, a) {
        if(f(x, i, a))
            throw i;
        else
            return r;
        }, -1);
    } catch(r) {
    if(typeof r == 'number')
        return r;
    else
        throw r;
    }
}

Array.prototype.findActualIndex = function(f) {
    try {
    return this.reduce(
        function(r, x, i, a) {
        if(f(x, i, a))
            throw i;
        else
            return r;
        }, -1);
    } catch(r) {
    if(typeof r == 'number')
        return r;
    else
        throw r;
    }
}

Hope this can help others. It's not running optimally with huge spare regions at the ends, but it is the best we can get out of the present version of javascript engines.

E_net4
  • 27,810
  • 13
  • 101
  • 139
Gunther Schadow
  • 1,490
  • 13
  • 22