28

A Tale of Two Functions

I have one function that fills an array up to a specified value:

function getNumberArray(maxValue) {
    const a = [];

    for (let i = 0; i < maxValue; i++) {
        a.push(i);
    }

    return a;
}

And a similar generator function that instead yields each value:

function* getNumberGenerator(maxValue) {
    for (let i = 0; i < maxValue; i++) {
        yield i;
    }
}

Test Runner

I've written this test for both these scenarios:

function runTest(testName, numIterations, funcToTest) {
    console.log(`Running ${testName}...`);
    let dummyCalculation;
    const startTime = Date.now();
    const initialMemory = process.memoryUsage();
    const iterator = funcToTest(numIterations);

    for (let val of iterator) {
        dummyCalculation = numIterations - val;
    }

    const finalMemory = process.memoryUsage();

    // note: formatNumbers can be found here: https://jsfiddle.net/onz1ozjq/
    console.log(formatNumbers `Total time: ${Date.now() - startTime}ms`);
    console.log(formatNumbers `Rss:        ${finalMemory.rss - initialMemory.rss}`);
    console.log(formatNumbers `Heap Total: ${finalMemory.heapTotal - initialMemory.heapTotal}`);
    console.log(formatNumbers `Heap Used:  ${finalMemory.heapUsed - initialMemory.heapUsed}`);
}

Running the Tests

Then when running these two like so:

const numIterations = 999999; // 999,999
console.log(formatNumbers `Running tests with ${numIterations} iterations...\n`);
runTest("Array test", numIterations, getNumberArray);
console.log("");
runTest("Generator test", numIterations, getNumberGenerator);

I get results similar to this:

Running tests with 999,999 iterations...

Running Array test...
Total time: 105ms
Rss:        31,645,696
Heap Total: 31,386,624
Heap Used:  27,774,632

Running Function generator test...
Total time: 160ms
Rss:        2,818,048
Heap Total: 0
Heap Used:  1,836,616

Note: I am running these tests on node v4.1.1 on Windows 8.1. I am not using a transpiler and I'm running it by doing node --harmony generator-test.js.

Question

The increased memory usage with an array is obviously expected... but why am I consistently getting faster results for an array? What's causing the slowdown here? Is doing a yield just an expensive operation? Or maybe there's something up with the method I'm doing to check this?

David Sherret
  • 101,669
  • 28
  • 188
  • 178
  • 1
    Sorry for the dumb question... what language is this? `function*` doesn't look like the syntax of Javascript I code in, and neither does the `const` keyword. – sg.cc Oct 07 '15 at 03:28
  • 5
    @sg.cc sorry, I know that can be confusing. It's ES6 javascript—not ES5. You can read about `function*` and the other features used here on [MDN](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/function*). – David Sherret Oct 07 '15 at 03:31
  • What environment are you running this code in? Does it support generators natively or are you using a transpiler such as Babel to generate ES5 code? – Jordan Running Oct 07 '15 at 03:34
  • @Jordan I was just updating the question with that information. No transpiler. It's node v4.1.1 on Windows 8.1. – David Sherret Oct 07 '15 at 03:36
  • 1
    `O(2n)` *is* `O(n)`. If you're going to analyze constant factors, you need much more attention to the relative costs of individual operations than that. – user2357112 Oct 07 '15 at 03:57
  • @user2357112 yes, [you're right](http://stackoverflow.com/a/25777739/188246). I was thinking that through again that it didn't make sense. – David Sherret Oct 07 '15 at 03:59
  • 2
    Try replacing the `let` in the generator function with a function scoped `var`. It seems that the `let` inside the loop incurs a lot of overhead. That will give you an improvement but the generator will be slower because you are adding call stack and scope chain over head for each iteration that is avoided with the `for` loop. – Blindman67 Oct 07 '15 at 04:01
  • @Blindman67 oh wow! That's it. Please provide an answer. Total time: `76ms` for generator, while `109ms` for array when using `var` instead of `let`. – David Sherret Oct 07 '15 at 04:05
  • @Blindman67 using `let` is faster if the code is changed to `let i; for (i = 0; i....` instead of `for(let i = 0;` – David Sherret Oct 07 '15 at 04:11
  • @DavidSherret: Because `let i; for (i = 0; i....` is equivalent to `var i; for (i = 0; i....` for that matter. `let` inside the `for` "header" are special and cause a new binding to be created in each iteration. Also keep in mind that ES6 are still very new so they are likely not optimized yet. – Felix Kling Oct 07 '15 at 04:24
  • @FelixKling I was aware that in this situation `let i; for (i = 0..` should be equivalent to `for (var i = 0..`. I was not aware that a new binding was created in *each iteration* when using `let` in a `for` loop though. Yes, it's probably only a matter of time before they optimize this. – David Sherret Oct 07 '15 at 04:42
  • @DavidSherret: That was the whole point for using `let` in `for` loops: Scope per iteration :) – Felix Kling Oct 07 '15 at 05:25
  • @FelixKling to clarify, I always thought of it as being bound once per `for` loop as opposed to it being rebound every iteration. The more I think though, that makes sense because when using `let` in a `for` loop you can use the variable in closures without worrying about accidentally using the value from the last iteration... (unlike what [happens with `var`](http://stackoverflow.com/a/17557081/188246)). – David Sherret Oct 07 '15 at 05:51
  • ES6 is still largely significantly slower than its ES5 analogues — http://kpdecker.github.io/six-speed/ — but will likely improve in the future. – kangax Oct 26 '15 at 15:27

5 Answers5

12

In the OP's example, the generator will always be slower

While JS engine authors will continue working to improve performance, there are some underlying structural realities that virtually guarantee that, for the OP's test case, building the array will always be faster than using an iterator.

Consider that a generator function returns a generator object.

A generator object will, by definition, have a next() function, and calling a function in Javascript means adding an entry to your call stack. While this is fast, it's likely never going to be as fast as direct property access. (At least, not until the singularity.)

So if you are going to iterate over every single element in a collection, then a for loop over a simple array, which accesses elements via direct property access, is always going to be faster than a for loop over an iterator, which accesses each element via a call to the next() function.

As I'm writing this in January of 2022, running Chrome 97, the generator function is 60% slower than the array function using the OP's example.

Performance is use-case-dependent

It's not difficult to imagine scenarios where the generator would be faster. The major downside to the array function is that it must build the entire collection before the code can start iterating over the elements, whether or not you need all the elements.

Consider a basic search operation which will only access, on average, half the elements of the collection. In this scenario, the array function exposes its "Achilles' heel": it must build an array with all the results, even though half will never be seen. This is where a generator has the potential to far outstrip the array function.

To demonstrate this, I slightly modified the OP's use-case. I made the elements of the array slightly more expensive to calculate (with a little division and square root logic) and modified the loop to terminate at about the halfway mark (to mimic a basic search).

Setup

function getNumberArray(maxValue) {
    const a = [];

    for (let i = 0; i < maxValue; i++) {
        const half = i / 2;
        const double = half * 2;
        const root = Math.sqrt(double);
        const square = Math.round(root * root);
        a.push(square);
    }

    return a;
}

function* getNumberGenerator(maxValue) {
    for (let i = 0; i < maxValue; i++) {
        const half = i / 2;
        const double = half * 2;
        const root = Math.sqrt(double);
        const square = Math.round(root * root);
        yield square;
    }
}
let dummyCalculation;
const numIterations = 99999;
const searchNumber = numIterations / 2;

Generator

const iterator = getNumberGenerator(numIterations);
for (let val of iterator) {
    dummyCalculation = numIterations - val;
    if (val > searchNumber) break;
}

Array

const iterator = getNumberArray(numIterations);
for (let val of iterator) {
    dummyCalculation = numIterations - val;
    if (val > searchNumber) break;
}

With this code, the two approaches are neck-and-neck. After repeated test runs, the generator and array functions trade first and second place. It's not difficult to imagine that if the elements of the array were even more expensive to calculate (for example, cloning a complex object, making a REST callout, etc), then the generator would win easily.

Considerations beyond performance

While recognizing that the OP's question is specifically about performance, I think it's worth calling out that generator functions were not primarily developed as a faster alternative to looping over arrays.

Memory efficiency

The OP has already acknowledged this, but memory efficiency is one of the main benefits that generators provide over building arrays. Generators can build objects on the fly and then discard them when they are no longer needed. In its most ideal implementation, a generator need only hold one object in memory at a time, while an array must hold all of them simultaneously.

For a very memory-intensive collection, a generator would allow the system to build objects as they are needed and then reclaim that memory when the calling code moves on to the next element.

Representation of non-static collections

Generators don't have to resolve the entire collection, which free them up to represent collections that might not exist entirely in memory.

For example, a generator can represent collections where the logic to fetch the "next" item is time-consuming (such as paging over the results of a database query, where items are fetched in batches) or state-dependent (such as iterating over a collection where operations on the current item affect which item is "next") or even infinite series (such as a fractal function, random number generator or a generator returning all the digits of π). These are scenarios where building an array would be either impractical or impossible.

One could imagine a generator that returns procedurally generated level data for a game based on a seed number, or even to represent a theoretical AI's "stream of consciousness" (for example, playing a word association game). These are interesting scenarios that would not be possible to represent using a standard array or list, but where a loop structure might feel more natural in code.

JDB
  • 25,172
  • 5
  • 72
  • 123
9

The terribly unsatisfying answer is probably this: Your ES5 function relies on features that (with the exceptions of let and const) have been in V8 since it was released in 2008 (and presumably for some time before, as I understand that what became V8 originated as part of Google's web crawler). Generators, on the other hand, have only been in V8 since 2013. So not only has the ES5 code had seven years to be optimized while the ES6 code has had only two, almost nobody (compared to the many millions of sites using code just like your ES5 code) is using generators in V8 yet, which means there has been very little opportunity to discover, or incentive to implement, optimizations for it.

If you really want a technical answer as to why generators are comparatively slow in Node.js, you'll probably have to dive into the V8 source yourself, or ask the people who wrote it.

Jordan Running
  • 102,619
  • 17
  • 182
  • 182
  • This is probably the real answer over time. Currently I've discovered, after being pointed in the right direction by @Blindman67, that moving the let outside the for loop `let i; for (i = 0; ...` causes using a generator function to be way to be faster. – David Sherret Oct 07 '15 at 04:14
  • 2
    Cannot reproduce the thing with `let` outside the loop. Today the array generation and iteration is still **more than three times faster** than a simple generator (in my test with Node 10 on Windows 7 x64). Someone screwed this up big time. – Neonit Sep 13 '18 at 12:48
  • Generators return a [generator object](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Generator) which must make a function call on each iteration. A simple array is just direct property access. If you are going to iterate over the entire array every time, then building the array is probably going to be worth it. However, if you are searching the array then the generator might be more performant. Performance depends entirely on how you are using them. – JDB Jan 11 '22 at 15:54
  • For example, in [this performance test](https://jsbench.me/rlkyaawcyd/1) which adds some minor complication to building each array item and then performs a search which on average only needs to iterate over half of the items, then the generator and array functions are practically identical to each other. If each array item needed to make a REST call, then the generator would far outstrip the array, even for small array sizes. – JDB Jan 11 '22 at 16:03
7

FYI this question is ancient in internet terms and generators have caught up (at least when tested in Chrome) https://jsperf.com/generator-vs-loops1

insomniac2846
  • 442
  • 5
  • 8
  • 2
    Even for me now in FF 77.0.1 on Linux, the array is ~6.05 times faster than the generator. – Andrew Jun 17 '20 at 17:11
  • Thank you. I was looking for information if generators are to be used at all. – loop Oct 31 '21 at 08:49
  • 3
    My performance tests show that the array function is still about twice as fast as the the generator in Chrome 97 (https://jsbench.me/8skyaaapbt/1). This makes sense as the generator function is syntactic sugar for a more complicated object which must make a function call on each iteration, while the array only has the one initial function call. The performance profile would change dramatically if the loop was performing a search which exits early, in which case the generator might outperform the array function depending on how complicated it is to build the array. – JDB Jan 11 '22 at 15:49
2

Try replacing the 'let' in the generator function with a function scoped 'var'. It seems that the 'let' inside the loop incurs a lot of overhead. Only use let if you absolutely have to.

Update.

The above is true at time of original posting. Current browsers have optimised heap allocation for block scoped variables.

Blindman67
  • 51,134
  • 11
  • 73
  • 136
0

In fact, running this benchmark now, generators at ~2x faster.

I've modified the code slightly (moved let i) and here is the full gist: https://gist.github.com/antonkatz/6b0911c85ddadae39c434bf8ac32b340

On my machine, these are the results:

Running Array... Total time: 4,022ms Rss: 2,728,939,520 Heap Total: 2,726,199,296 Heap Used: 2,704,236,368

Running Generator... Total time: 2,541ms Rss: 851,968 Heap Total: 0 Heap Used: -5,073,968

I was very curious myself and could not find a proper answer. Thanks @David for providing the test code.

Anton
  • 2,282
  • 26
  • 43
  • 6
    This test is so bad... First of all, the generator test doesn't mutate the array as the simple loop test does. Second of all, run those tests separately in their own scripts. Otherwise, you are tainting the memory results by the previous test (there is no guarantee that memory used by the first test will be GC'ed immediately). If you do those changes you'll see that generator version is still slower (4.4s here compared to 3.6s on my machine). – rchl Nov 14 '20 at 11:13