Is there an idiom for iterating over large datasets in ES6 to avoid browser timeout?

Let's say I need to do something like generate 16 million cubes, and a straightforward loop times out the browser.

function generateCubes(num) {
  var cubes = [];
  for (var ii = 0; ii < num; ++ii) {
     cubes.push(generateCube());
  }
  return cubes;
}

var cubes = generateCubes(16000000);

So I can turn that into an async callback like this

function generateCubes(num, callback) {
  var maxPerIteration = 100000;
  var cubes = [];

  function makeMore() {
    var count = Math.min(num, maxPerIteration);
    for (var ii = 0; ii < count; ++ii) {
      cubes.push(generateCube());
    }
    num -= count;
    if (count) {
      setTimeout(makeMore, 0);
    } else {
      callback(cubes);
    }
  }
  makeMore();
}

but sadly I suddenly have to restructure all my code

generateCubes(16000000, function(cubes) {
   ...
   // all the code that used to be after cubes = generateCubes
});    

I can turn that into something promise-based, but that only adds to the amount of boilerplate.

In either case I suppose I could write a generic version

function generateThings(factory, num, callback) {
  var maxPerIteration = 100000;
  var things = [];

  function makeMore() {
    var count = Math.min(num, maxPerIteration);
    for (var ii = 0; ii < count; ++ii) {
      things.push(factory());
    }
    num -= count;
    if (num) {
      setTimeout(makeMore, 0);
    } else {
      callback(things);
    }
  }
  makeMore();
}

In this particular case I'm generating 16 million things which is a kind of iteration. Maybe next I want to iterate over those things.

 function forEachAllTheThings(things, op, callback) {
   var maxPerIteration = 100000;
   var num = things.length;
   var offset = 0;  // index of the first thing not yet processed

   function doMore() {
     var count = Math.min(num, maxPerIteration);
     for (var ii = 0; ii < count; ++ii) {
       op(things[offset + ii]);
     }
     offset += count;
     num -= count;
     if (num) {
       setTimeout(doMore, 0);
     } else {
       callback();
     }
   }
   doMore();
 }

Is there some more ES6 way of doing this that is more concise or more generic?

NOTE: Please don't get hung up on generating cubes. That's not the question. Also, it's not just about the timeout issue; it can also be a jank issue. For example, I once worked on a project that needed to deserialize a scene graph. A moderately complicated graph might take 5-10 seconds to deserialize (turn into objects). During those 5-10 seconds the browser was frozen.

The solution was similar to forEachAllTheThings above in that we only read through N objects per tick so as not to lock up the browser. It was all custom code. I'm just wondering if some of the new ES6 features provide any kind of simplification of solving the issue of doing lots of work over multiple ticks the same way they seem to simplify async code (as this is in a sense a form of async code)
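
One way the "N objects per tick" idea could be expressed with ES6 features is a generator that yields between items, plus a small driver that keeps calling `next()` until a time budget is used up and then defers to the next tick. This is only a sketch: `runInSlices`, `forEachThing`, and the 10 ms default budget are hypothetical names and values chosen for illustration, not part of any library.

```javascript
// Hypothetical driver: advances `iterator` for at most `budgetMs` per tick,
// then yields control back to the browser with setTimeout.
// Resolves with the generator's return value once it is done.
function runInSlices(iterator, budgetMs = 10) {
  return new Promise((resolve, reject) => {
    function slice() {
      const deadline = Date.now() + budgetMs;
      try {
        do {
          const step = iterator.next();
          if (step.done) {
            resolve(step.value);
            return;
          }
        } while (Date.now() < deadline);
      } catch (err) {
        reject(err);
        return;
      }
      setTimeout(slice, 0);  // out of budget: continue on a later tick
    }
    slice();
  });
}

// Hypothetical generator version of forEachAllTheThings:
// the loop body stays a plain loop, with a `yield` marking a safe pause point.
function* forEachThing(things, op) {
  for (const thing of things) {
    op(thing);
    yield;
  }
}
```

Usage would look like `runInSlices(forEachThing(things, op)).then(...)`. Checking the clock after every item has a cost, so for very cheap operations it may be better to yield only every few thousand items.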


Update

Based on @Bergi's suggestion of promisifying setTimeout I think this is what was being suggested.

// returns a Promise that resolves in `time` milliseconds
function sleep(time) {
  return new Promise(function(resolve, reject) {
    setTimeout(resolve, time);
  });
}

// returns a promise that resolves to an array of things
function generateThings(factory, num) {
  var maxPerIteration = 100000;
  var things = [];

  function makeMore() {
    var count = Math.min(num, maxPerIteration);
    for (var ii = 0; ii < count; ++ii) {
      things.push(factory());
    }
    num -= count;
    return num ? sleep(0).then(makeMore) : things;
  }

  // we need to start off with one promise
  // in case num <= maxPerIteration
  return Promise.resolve(makeMore());
}

function generateCube() {
  return Math.random();  // could be anything
}

generateThings(generateCube, 300000)
.then(function(things) {
  console.log(things.length);
});

I suppose that is slightly ES6ified and a couple of lines smaller assuming you already have sleep in your code (which seems like a reasonable assumption).
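
For comparison, here is a sketch of how the same function might look with `async`/`await`, which Bergi mentions in the comments and which was standardized after ES6 (in ES2017). It assumes an environment that supports `async` functions; the `maxPerIteration` parameter is an addition made here so the batch size can be changed.

```javascript
// returns a Promise that resolves in `time` milliseconds
function sleep(time) {
  return new Promise((resolve) => setTimeout(resolve, time));
}

// returns a promise that resolves to an array of things
// (maxPerIteration as a parameter is an assumption of this sketch)
async function generateThings(factory, num, maxPerIteration = 100000) {
  const things = [];
  while (things.length < num) {
    const count = Math.min(num - things.length, maxPerIteration);
    for (let ii = 0; ii < count; ++ii) {
      things.push(factory());
    }
    // only pause if there is more work left
    if (things.length < num) {
      await sleep(0);
    }
  }
  return things;
}
```

The recursion through `makeMore` becomes an ordinary `while` loop, and callers can either chain `.then(...)` as before or `await generateThings(...)` inside another `async` function.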

gman
  • What does `generateCube` return? – T.J. Crowder Dec 28 '15 at 11:57
  • Does it matter? I'm looking for something generic. Let's assume it's `function generateCube() { return new Cube(); }` – gman Dec 28 '15 at 11:59
  • @gman: It matters in that there is a large class of things that can efficiently be transferred between threads in browser-based JavaScript, and then a large class of things that can only be cloned (less efficient), and a third class that have to be serialized (a lot less efficient). – T.J. Crowder Dec 28 '15 at 12:00
  • You can use this hackish approach that can make async code feel synchronous: https://davidwalsh.name/async-generators – Andreas Louv Dec 28 '15 at 12:01
  • Yes, if you are looking for ES6 idioms specifically, you should definitely use promises. And no, they will simplify your code instead of adding boilerplate code. – Bergi Dec 28 '15 at 12:01
  • @dev-null, I just got through reading those articles before I posted here. I'm not seeing the solution though – gman Dec 28 '15 at 12:06
  • @dev-null: you should learn about `async`/`await` (the ES7 idiom) first before applying the ES6 generator hack – Bergi Dec 28 '15 at 12:08
  • @Bergi, I'm not seeing how promises make it simpler. In fact converting the above code to use promises adds 2 lines or so of boilerplate per function. http://pastebin.com/856YLyiN I guess I save some on the usage but that's only true if I use them a lot. – gman Dec 28 '15 at 12:12
  • @gman: Avoid the [`Promise` constructor antipattern](http://stackoverflow.com/q/23803743/1048572) - you should only promisify `setTimeout`. Then you can `return num ? sleep(0).then(makeMore) : things;` from `makeMore`. – Bergi Dec 28 '15 at 22:57

1 Answer

I'd probably offload the generation of the cubes to a web worker, which won't have the timeout problem, assuming that the cubes consist only of JavaScript basic types and so can be posted to the main UI thread when ready. Ideally, the cubes would be transferable objects, so that they are transferred rather than cloned when they move from the worker thread to the main UI thread.
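
A rough sketch of that idea, assuming the cube data can be packed into a typed array (here one random number per cube, mirroring the `Math.random()` placeholder in the question). `generateCubeData` and `generateInWorker` are hypothetical names; building the worker from a `Blob` URL is just one way to keep the example in a single file, and a separate worker script would work equally well.

```javascript
// Runs inside the worker: produces the cube data as a typed array,
// whose underlying ArrayBuffer is a transferable object.
function generateCubeData(num) {
  const data = new Float64Array(num);
  for (let ii = 0; ii < num; ++ii) {
    data[ii] = Math.random();  // placeholder for real cube data
  }
  return data;
}

// Runs on the main thread (browser only): starts a worker, asks it for
// `num` cubes, and resolves with the typed array once it is posted back.
function generateInWorker(num) {
  const source = `
    ${generateCubeData.toString()}
    self.onmessage = function (e) {
      const data = generateCubeData(e.data);
      // the second argument transfers the buffer instead of cloning it
      self.postMessage(data, [data.buffer]);
    };
  `;
  const url = URL.createObjectURL(
    new Blob([source], { type: 'application/javascript' })
  );
  const worker = new Worker(url);

  function cleanUp() {
    worker.terminate();
    URL.revokeObjectURL(url);
  }

  return new Promise((resolve, reject) => {
    worker.onmessage = (e) => {
      cleanUp();
      resolve(e.data);
    };
    worker.onerror = (e) => {
      cleanUp();
      reject(new Error(e.message));
    };
    worker.postMessage(num);
  });
}
```

In a browser, usage would be along the lines of `generateInWorker(16000000).then(function(cubes) { ... })`. The main thread stays responsive the whole time, at the cost of the cubes being plain data rather than arbitrary objects.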

T.J. Crowder