
Is there any clean and efficient way to add the contents of one array directly to another array without making an intermediate/temporary copy of all the data?

For example, you can use .push() to add the contents of one array directly onto another like this:

// imagine these are both large arrays
let base = [1, 2, 3];
let data = [4, 5, 6];
base.push(...data);

But the spread makes a copy of every item in data, because each one becomes a separate argument to .push(). The same is true for .splice():

// imagine these are both large arrays
let base = [1, 2, 3];
let data = [4, 5, 6];
base.splice(base.length, 0, ...data);

Both of these seem inefficient from a memory point of view (an extra copy of all the data) and from an execution point of view (the data gets iterated twice).

Methods like .concat() don't add the contents of one array to another; instead, they create a new array with the combined contents (which copies the contents of both source arrays).

I've got some big arrays with lots of manipulations, and I'm trying to ease the burden on the CPU and garbage collector by avoiding unnecessary intermediate copies of things. I find it curious that I haven't found a built-in operation for this. So far, my best option for avoiding unnecessary copies has been this:

// imagine these are both large arrays
let base = [1, 2, 3];
let data = [4, 5, 6];
for (let item of data) {
    base.push(item);
}

which is probably not as efficient as a single built-in operation could be, and it's obviously multiple lines of code when you'd like it to be one.

jfriend00
  • AFAIK it can never be one operation if you want to end up with an array. You can go with a custom data structure if you want one operation for merging, like a 2D array, but iterating over that would be complex. Even splice and the spread operator are not "one operation" even though they use so much extra space – bugwheels94 May 22 '20 at 23:57
  • The browser may optimize `base.push(...data)` (now or in future versions). See for example https://dev.to/uilicious/javascript-array-push-is-945x-faster-than-array-concat-1oki , https://stackoverflow.com/questions/5080028/what-is-the-most-efficient-way-to-concatenate-n-arrays and https://dmitripavlutin.com/javascript-spread-operator-performance-optimization/ – Sebastian B. May 23 '20 at 00:11
  • @SebastianB. - The problem is, if it's a large array and you try to run it in an environment that doesn't optimize it, you get a stack overflow. So, "may get optimized" doesn't really work too well. It is interesting to see that it can get optimized. I just need a way that is always optimized. Your 2nd reference mentions the stack overflow problem in one of the answers. So, I can't code in a way that might do that. – jfriend00 May 23 '20 at 00:15
  • I understand, and it's an interesting question. But I fear optimizing by hand is also dangerous in such highly interpreted languages. What works today may not be as performant tomorrow if you push it to the limit. Is using [typed arrays](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Typed_arrays) an option? Or WebAssembly for that part (https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/WebAssembly/Memory). Just brainstorming without having used those APIs myself... – Sebastian B. May 23 '20 at 00:21
  • @SebastianB. - I understand the danger of tweaking an optimization on a moving target. But, I have to avoid structures that can lead to a stack explosion by pushing a giant array onto the stack so I can't really use that unless my code only targets a very specific JS engine known to be safe for my specific condition like nodejs V12+ and then it assumes that the JS engine in nodejs never goes backwards. That's why my current implementation is using the `for` loop in my question. At least it's known safe in all engines. I don't know anything yet about webAssembly - maybe time to learn. – jfriend00 May 23 '20 at 00:43
  • @SebastianB. - I have used nodejs Buffers for some specific circumstances like this (for code meant to run only in nodejs) because it gives you more direct control when you're trying to avoid creating lots of temporary objects. The nodejs Buffer interface is much cleaner to use than the Javascript TypedArray stuff in my opinion (even though the Buffer interface sits on top of TypedArrays). Is there a way to grow a TypedArray in place? – jfriend00 May 23 '20 at 00:59
  • @jfriend00 growing in place doesn't seem to be possible. But I found this about concatenating, maybe it helps: https://exploringjs.com/es6/ch_typed-arrays.html#_concatenating-typed-arrays – Sebastian B. May 24 '20 at 00:24
  • @SebastianB. - The interesting part of that is [`TypedArray.prototype.set()`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/TypedArray/set). That's what I was hoping a regular array had. I guess you can do this efficiently with a TypedArray that is already big enough, but you can't grow it so I'm not sure how much better this would be than `Array.prototype.concat()` which likely does the same thing under the covers for regular arrays. – jfriend00 May 24 '20 at 00:41
  • Yes, `.set()` is the new stuff :-) Couldn't you create a single buffer big enough to handle all your needs? Depends on your use case, for sure... – Sebastian B. May 24 '20 at 00:51

1 Answer


Per Sebastian's helpful comments, the ideal case would be the fast-path optimization for code like this:

array.push(...data)

that V8 now deploys when it detects that data is a known type of iterable: it takes an optimized shortcut, growing the target array once and then copying the data over without making the intermediate copy on the stack.

But, apparently the current V8 does not apply such an optimization to this specific case. When I tried this in node v14.3:

const targetCnt = 100_000;
const sourceCnt = 100_000_000;

// create initial arrays
let target = new Array(targetCnt);
let source = new Array(sourceCnt);

target.fill(1);
source.fill(2);

let b1 = new Bench().markBegin();   // Bench is a custom timing helper (not shown)
target.push(...source);
b1.markEnd();
console.log(`.push(...source): ${b1.formatNs()}`);

I got this error:

FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory

And, when I reduced the sourceCnt to 1_000_000, then I got this error:

RangeError: Maximum call stack size exceeded

So, I guess that optimization only applies to other circumstances mentioned in the article, not to this one.

So, until that fast-path optimization is ubiquitous in every target you run on, and you know exactly which situations it applies to in your code (I wonder if it will ever be codified in a spec?), and there's no danger of passing an unknown iterable that wouldn't get such preferential treatment, perhaps the best option is to just make your own mini-version of the optimization as a function:

function appendToArray(targetArray, sourceArray) {
    // grow the target array once to its final length
    let targetIndex = targetArray.length;
    let sourceLen = sourceArray.length;
    targetArray.length = targetIndex + sourceLen;
    // copy the data over, one element at a time
    for (let sourceIndex = 0; sourceIndex < sourceLen; sourceIndex++, targetIndex++) {
        targetArray[targetIndex] = sourceArray[sourceIndex];
    }
    return targetArray;
}
jfriend00