-2

The following recursive code will cause a stack overflow if the array list is too large. How can you fix this and still retain the recursive pattern?

var list = readHugeList();
var nextListItem = function() {
  var item = list.pop();
  if (item) {
    // process the list item...
    nextListItem();
  }
};
  • A recursive pattern, if recursed enough times, will cause a stack overflow. If the recursive pattern is required, then the possibility of an overflow will exist as well – CertainPerformance Aug 02 '19 at 11:46
  • This sounds like an interview question... – VLAZ Aug 02 '19 at 11:48
  • It's implied that `readHugeList()` reads a whole load of 'something' into memory... but what if `pop()` were tailored to read from a stream instead? –  Aug 02 '19 at 11:50
  • Looking up "recursion trampolines" might give you some pointers. – user3297291 Aug 02 '19 at 11:52
  • 1
    [Yeah, I *thought* the premise sounded familiar...](https://stackoverflow.com/questions/39459236/understanding-event-queue-and-call-stack-in-javascript/) – VLAZ Aug 02 '19 at 12:21

3 Answers3

2

Use the event queue

What you can easily do in this case is offload the recursion onto the event queue.

var list = readHugeList();
var nextListItem = function() {
  var item = list.pop();
  if (item) {
    console.log("processing", item);
    
    // schedule the processing of the next list item
    setTimeout(nextListItem);
  }
};

nextListItem();


function readHugeList() {
  return [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]; //example
}

This is the simplest you can do - setTimeout will schedule the nextListItem to be called again but first the current stack frame will clear. This way, you never cause a stack overflow and can handle arbitrarily large input.

NOTE: This will not cause a stack overflow through recursion, however, it's going to be slow. The problem is that the minimum timeout length is 4ms, so if each call takes less than 4ms to complete, then the whole operation will take longer. For example, if each operation takes 4ms, then delaying via setTimeout will take 4 times longer. Even if the minimum timeout is removed (for example in Node.js or using setImmediate in Node.js), then it will be faster but still slower than other alternatives - this will still wait for the entire event loop to finish. While faster than a forced minimum 4ms wait, it's still an extra overhead.

However, this method does allow for other things can happen in the meantime, so even if it's slower, it might be a suitable implementation to prevent blocking. One use for this is creating incremental updates. Otherwise it might allow other unrelated tasks to occur, so they are not delayed. So, it's a useful technique but probably not the best - if you have a large enough dataset that you get a stack overflow, then you are definitely going to run into a time issue.

Use a trampoline

An alternative is to employ a trampoline. This is basically a manual implementation of tail call optimisation. Here is a sample implementation:

function trampoline(fn) {
  while (typeof fn === "function") {
    fn = fn();
  }

  return fn;
};

This is a simple implementation of the basics - it doesn't take arguments into account but better implementations do. There are library implementations of a trampoline, so you don't need to write it yourself.

At any rate, a trampoline also requires a change to the recursive function - instead of returning a result, it has to return a thunk - that needs to be evaluated. So, here is how it can work:

var list = readHugeList();
var nextListItem = function() {
  var item = list.pop();
  if (item) {
    console.log("processing", item);
    
    // a thunk for the next list item
    return () => nextListItem();
  }
};

trampoline(nextListItem);


function trampoline(fn) {
  while (typeof fn === "function") {
    fn = fn();
  }

  return fn;
};

function readHugeList() {
  return [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]; //example
}

(Note: return () => nextListItem() can also be done simply as return nextListItem but I wanted to be explicit here)

The trampoline will keep calling the function until it stops returning more thunks to be evaluated. And each thunk itself will contain the recursive call to the function. So, instead of the function directly calling itself, you now call it indirectly when evaluating the thunk via fn(). And since this is done in a loop, it still prevents the stack overflow from having a very deeply nested recursion.

Third option use the event queue and cheat a bit

OK, it's not really cheating but you can give yourself an advantage. As mentioned, delegating the recursion to the event queue will solve the stack overflow problem but introduces a speed problem because it imposes a minimum delay to each of the delayed actions. Just to demonstrate how this will affect the performance, here is an example:

var list = readHugeList();
var nextListItem = function() {
  var item = list.pop();
  if (item) {
    console.log("processing", item);
    
    // schedule the processing of the next list item
    setTimeout(nextListItem);
  } else {
    console.log("operation took (in ms):", performance.now() - startTime)
  }
};

var startTime = performance.now();
nextListItem();

function readHugeList() {
  //generate an array of 100 sequential numbers
  return Array.from({length: 100}, (_, i) => i);
}

Recursively going through a list of one hundred items takes about a second (on my machine, at least). Whatever time it takes, it's too long - a hundred items is not that many and won't even cause a call stack overflow in the first place. I will not try it with a number that might cause an overflow as it will work but it will be dreadfully long to wait for it to finish.

However, you can still leverage the event loop via microtasks. A very quick summary is there is two types of tasks the event loop handles

  • macrotasks (like what setTimeout sets) - they have a minimum delay of 4ms and will also allow for other stuff to happen, like browser animations.
  • microtasks - they are higher priority, don't have any minimum delay and will just execute as soon as possible.

If you offload to the microtask queue instead of the macrotask queue, then you will get much faster processing. To do that, you can use a Promise in the browser - resolved promises will add any further processing needed (callbacks added with .then) to the microtask queue.

Here is how this looks and behaves:

var list = readHugeList();
var nextListItem = function() {
  var item = list.pop();
  if (item) {
    console.log("processing", item);
    
    // schedule the processing of the next list item
    Promise.resolve()//resolve immediately
      .then(nextListItem); //adds a microtask
  } else {
    console.log("operation took (in ms):", performance.now() - startTime)
  }
};

var startTime = performance.now();
nextListItem();

function readHugeList() {
  //generate an array of 10 000 sequential numbers
  return Array.from({length: 10000}, (_, i) => i);
}

This takes about second and a half (on my machine). This is comparable with setTimeout with exception that the microtask version process two orders of magnitude more items. That's 10 000 vs only 100.

Community
  • 1
  • 1
VLAZ
  • 26,331
  • 9
  • 49
  • 67
  • 1
    down-voted for big bold text that suggests using the event queue before anything else. see [this Q&A](https://stackoverflow.com/a/43596323/633183) for further clarification on why `setTimeout` is an awful choice – Mulan Aug 03 '19 at 15:07
  • @user633183 yet this is apparently [the intended answer](https://stackoverflow.com/questions/39459236/understanding-event-queue-and-call-stack-in-javascript/). I thought I recognised the pattern, so I mentioned the event queue because of the expected answer in those cases. It then turned out this is literally the same question. Not sure what the original source is. – VLAZ Aug 03 '19 at 15:18
1

You could move the recursive call to the end of the function for a tail call optimization .

This replaces the return value of the call of the function and does not increase the stack.

var list = readHugeList();
var nextListItem = function() {
        var item = list.pop();
        if (!item) return;
        // process the list item...
        return nextListItem();
    };
Nina Scholz
  • 376,160
  • 25
  • 347
  • 392
  • Worth noting that the environment has to support TCO. I thought that Node.js (well, V8) scrapped their TCO implementation somewhat recently (last year?). I'm not sure on the state of TCO support at the moment – VLAZ Aug 02 '19 at 11:58
  • right, i think, it's more a theroretical question of op. in real life, no one would take a recursion for getting an item, of the array. a generator/iterator would serve as well. – Nina Scholz Aug 02 '19 at 12:00
0

Or simply rewrite ex. like this:

var list = (function readHugeList() { return [3, 1, 4, 1, 5, 9, 2] })();
var nextListItem = function() {
  do {
    var item = list.pop();
    if (item) {
      // process the list item...
      console.log("Processing item:", item);
      continue;
    }
  } while (item);
};

nextListItem();
Jan
  • 2,178
  • 3
  • 14
  • 26