7

I have two arrays, available_items and requested_items. I want to remove elements from requested_items that are missing in available_items. Using forEach does not give the expected result apparently because the internal index increments even when an element is deleted and the next element would have the old index.

Here's a test case (also in this jsbin):

var available_items = [2, 5, 9, 36, 48, 23];
var requested_items = [5, 12, 49, 30, 90, 17];
requested_items.forEach(function(v, i, a) {
  if(available_items.indexOf(v) == -1) {
    console.log("will remove " + i + ' ' + v);
    a.splice(i, 1);
  } else console.log("will keep " + i + ' ' + v);
});
console.log('Resulting request array is ' + requested_items.toString());

The result is:

"will keep 0 5"
"will remove 1 12"
"will remove 2 30"
"will remove 3 17"
"Resulting request array is 5,49,90"

This will be repeated tens of thousands times, so, using libraries (e.g. underscore), is something I want to avoid if they adversely impact performance.

So, my question is, what is the least expensive way to correct this?

Majid Fouladpour
  • 29,356
  • 21
  • 76
  • 127

2 Answers2

13

Use a for loop and count backwards, so you don't have a problem with the index.

for(var i = requested_items.length - 1; i >= 0; i--) {
   // your logic
}

It "feels" hacky but it does the trick.

TetraDev
  • 16,074
  • 6
  • 60
  • 61
Jo David
  • 1,696
  • 2
  • 18
  • 20
  • Still faster than `filter` even with large arrays: http://jsperf.com/reducing-array-with-filter-vs-loop (at least in chrome) – Jo David Jul 25 '13 at 08:48
2

The spec: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/forEach

The range is determined before the forEach function begins executing - so you are calling i = [0 .. 5]... elements a[1], a[2], .. a[5].

The values themselves are determined when the index is visited. So if you delete a[1], then skip to a[2], you're jumping over a value! (In your example, a[1] would up being 49.)

Last important part of the spec: indexes that are deleted are not visited, so it's a silent warning. In a real-life C++ array, you'd see an eqivalent of: 'index 4 is out of range! index 5 is out of range!'

Anecdotally, if I were you I'd probably avoid modifying this array within the loop as a matter of principle. In other programming languages, forEach loops are executed in parallel, and the order of the indexes is not set in stone.. modifying the encompassing structure causes undefined behavior. As you can see from the spec, here it's kind of an... ehh, you shouldn't do that situation, but here's what'll happen if you do...

My solution would be to create a third array and use it instead:

var found_items = [];
requested_items.forEach(function(v, i, a) {
  if(available_items.indexOf(v) !== -1) {
    found_items.push(v);
  }
});

An incredibly hacky way to keep your chosen style would be to use a while() loop to stay on the same index every time you delete an element.

requested_items.forEach(function(v, i, a) {
  if(available_items.indexOf(v) == -1) {
    console.log("will remove " + i + ' ' + v);
    a.splice(i, 1);

    while(available_items.indexOf(a[i]) === -1) {
      console.log("will also remove " + i + ' ' + a[i]);
      a.splice(i, 1);
    }

  } else console.log("will keep " + i + ' ' + v);
});

Ugh, that's ugly.

MST
  • 649
  • 6
  • 4