I'm working with array chunking on long datasets. I need to split an array into a new array of chunks of a given size. Currently I use the solution below, but it performs poorly.

function array_to_chunks(data, size){
   let chunks = []
   let d = data.slice()
   while (d.length >= size) chunks.push(d.splice(0, size))
   return chunks
}

I'd like to find a better approach that is fast enough, and to understand why my code performs poorly.

Number16BusShelter

3 Answers


This is slightly more performant because you don't have to copy the array:

const createGroupedArray = function (arr, chunkSize) {

    if (!Number.isInteger(chunkSize)) {
        throw new Error('Chunk size must be an integer.');
    }

    if (chunkSize < 1) {
        throw new Error('Chunk size must be greater than 0.');
    }

    const groups = [];
    let i = 0;
    while (i < arr.length) {
        // slice copies the range without mutating arr; i advances by chunkSize
        groups.push(arr.slice(i, i += chunkSize));
    }
    return groups;
};
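For example (note that, unlike the original code, a trailing partial chunk is kept rather than dropped):

createGroupedArray([1, 2, 3, 4, 5], 2);
// => [[1, 2], [3, 4], [5]]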

If you are doing I/O, then use Node.js streams:

const { Writable } = require('stream');

const strm = new Writable({
  write(chunk, enc, cb){
     // do whatever with the chunk here
     cb(); // then signal that it has been handled
  }
});
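Here is a minimal sketch of wiring the two together; since the chunks here are plain arrays rather than Buffers, the stream needs object mode, and `records` is a placeholder for your data:

const { Writable } = require('stream');

const strm = new Writable({
  objectMode: true, // accept plain JS values instead of Buffers
  write(chunk, enc, cb) {
    // process one group of records here
    cb(); // signal completion so the next chunk can be written
  }
});

for (const group of createGroupedArray(records, 1000)) {
  strm.write(group);
}
strm.end();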
  • `slightly more performant` - by my calculation it is up to 400x faster! depending on original array size and chunk size of course (make that over 700x faster) – Jaromanda X Oct 11 '18 at 01:10
  • Unfortunately, I have to copy it, as `data` is being accessed by other methods of the object (the provided code is an adaptation) – Number16BusShelter Oct 11 '18 at 01:11
  • I think you misunderstand ... you have to copy it in your code because `.splice` mutates the original ... `.slice` does not alter the original array, therefore you don't **need** to copy it, as the original array remains untouched (see the sketch below) – Jaromanda X Oct 11 '18 at 01:13
  • What sizes are you talking about ... as in, length of the original array, and length of the chunks? (Note: I think your original code would not return everything if the array length is not an exact multiple of the chunk size.) – Jaromanda X Oct 11 '18 at 01:14
  • @JaromandaX, each element of the array is ~434 bytes, and the array length is ~500,000 entries – Number16BusShelter Oct 11 '18 at 01:15
  • I don't care about element size ... I care about array size and chunk size – Jaromanda X Oct 11 '18 at 01:16
  • @JaromandaX, OK, now I get it! I'll try to rewrite the code and respond – Number16BusShelter Oct 11 '18 at 01:16
  • It works better, but not by that much. Thanks anyway for your help – Number16BusShelter Oct 11 '18 at 01:23
  • @FelixRiverwood `slice` should be fine. This approach is the same as the answer I deleted. I'm afraid you can't do better than this in terms of big-O runtime. – slider Oct 11 '18 at 01:40
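To illustrate the `slice` vs `splice` distinction from the comments above:

const data = [1, 2, 3, 4, 5];

const copied = data.slice(0, 2);   // [1, 2]
console.log(data.length);          // 5 - slice leaves the original untouched

const removed = data.splice(0, 2); // [1, 2]
console.log(data.length);          // 3 - splice mutated the original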

You can use lodash's `chunk` method; it does exactly what you need:

const _ = require('lodash');
_.chunk([1, 2, 3, 4, 5, 6], 2);
// => [[1, 2], [3, 4], [5, 6]]
Mohammed Essehemy

I am interested to hear your opinion on this approach:

const arr = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
const size = 5

const chunkIt = (arr, size) => {
  let buckets = []

  // Just create the buckets/chunks storage
  for (let i = 1; i <= Math.ceil(arr.length / size); i++) {
    buckets.push([])
  }

  // Put in the buckets/storage by index access only
  for (let i = 0; i < arr.length; i++) {
    const arrIndex = Math.ceil((i + 1) / size) - 1 // equivalent to Math.floor(i / size)
    buckets[arrIndex].push(arr[i])
  }

  return buckets;
}

console.log(chunkIt(arr, size))

I did some basic JS benchmarking and it performed well. The idea is to pre-create the buckets, since that operation should not be expensive, and then push by index only (see the timing sketch below).
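For reference, this is the kind of timing harness I mean; a minimal sketch using Node's built-in `perf_hooks`, with the ~500,000-entry array size mentioned in the comments above:

const { performance } = require('perf_hooks');

const big = Array.from({ length: 500000 }, (_, i) => i);

const t0 = performance.now();
chunkIt(big, 5);
const t1 = performance.now();
console.log(`chunkIt(${big.length} items) took ${(t1 - t0).toFixed(2)} ms`);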

Akrion