
Suppose there is a JavaScript program written using Node.js that will look through all employees, fetch some data for each, do some calculation, and post the result back to another server:

// without error handling to keep it simple for now

for (let employee of employees) {
  new Promise(function(resolve) {
    fetch(someUrlToServerA + employee.id).then(resolve);
  }).then((data) => {
    let result = doCalculations(data);
    return postData(someUrlToServerB + employee.id, result);
  }).then(() => console.log("Finished for", employee.id));
}
console.log("All done.");

If written using async/await, it may be roughly equivalent to:

(async function(){
  for (let employee of employees) {
    data = await fetch(someUrlToServerA + employee.id);

    let result = doCalculations(data);
    await postData(someUrlToServerB + employee.id, result);

    console.log("Finished for", employee.id);
  }
  console.log("All done.");
})();

Let's say there are 6000 employees: then won't the program (running using Node.js) keep on making requests to ServerA, and in fact print out "All done" almost instantly (maybe within seconds), but now just have 6000 threads all trying to get data from ServerA, and then do calculations, and post to ServerB? Would there be a better way to do it?

It seems there might be some benefits to making requests in parallel: if each request to ServerA takes 3 seconds, then making parallel requests probably will save some time if it can return 4 requests within the 3 seconds. But if ServerA is sent too many requests at the same time, then it may just bottle up the requests and only be able to process a few at a time. Or, using this method, does the system actually limit the number of simultaneous fetches by limiting the number of connections at the same time? So let's say it limits 4 connections at the same time: then "All done" is printed quickly, but internally it is processing 4 employees at the same time, so it is alright? If ServerA and ServerB don't complain about having several requests at the same time, and the calculation takes, let's say, milliseconds to finish, then this method may take 1/4 of the time to finish compared to the synchronous version?

nonopolarity
  • You seem to be assuming that asynchronous implies parallel. – user7860670 Dec 29 '19 at 08:23
  • Why don't you try it out? You'll see that code following an `await` will not execute before the corresponding promise has resolved, i.e. the request has returned a response. You can easily test this yourself. – trincot Dec 29 '19 at 08:24
  • I was just thinking through the principles of promises and async/await and how they would work. At first I thought I would not have a large amount of data and post it 6000 times to a server (maybe identified as a hacker if I do). But I guess I could make 6000 requests to a local server... if I set up a RoR server or nginx or apache on my local machine, and set a script that delays returning until 3 seconds later, that should simulate the situation. At first I was only thinking about the principle of how promise/await/async works – nonopolarity Dec 29 '19 at 08:30

1 Answer


First of all, JavaScript typically executes your code with one thread, whether you use promises or not. Multiple threads can come into play when you use Web Workers, and also in the lower-level, non-JavaScript code that JavaScript relies on (like file I/O, HTTP request handling, etc.).

The first piece of code is not well designed, as the for loop executes synchronously, so the next iteration will not wait for the promise of the previous iteration to resolve.

Because of this, the requests will indeed all be triggered at almost the same time, and "All done." will be output synchronously (immediately). A server may complain about the many requests it gets in a very short time. Servers often set a maximum limit on the number of requests per time unit, or (in the worst case) they may just go down under the load.
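You can see this with a minimal sketch, where setTimeout stands in for fetch and the ids are made up for the demo:

```javascript
// setTimeout stands in for fetch; the ids are made up for the demo
const order = [];
const ids = [1, 2, 3];

for (const id of ids) {
  new Promise((resolve) => setTimeout(resolve, 10))
    .then(() => order.push(id));
}

// The loop has already finished; none of the promises has resolved yet:
order.push("All done.");
console.log(order); // [ 'All done.' ]
```

Only after the timers fire do the ids get appended, in the order the "requests" were made.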

Also:

  • You are employing the promise constructor antipattern: don't create a new Promise when you already have a promise (returned by fetch)

  • The promise returned by fetch does not resolve to the data directly. Instead it resolves to a response object that exposes methods to get to the data asynchronously.

Here is a possible way to chain the promises, so the next fetch will only happen when the previous one has received a response:

let promise = Promise.resolve();
for (let employee of employees) {
    promise = promise.then(() => fetch(someUrlToServerA + employee.id))
        .then((response) => response.json()) // assuming you get data as JSON
        .then((data) => postData(someUrlToServerB + employee.id, doCalculations(data)))
        .then(() => console.log("Finished for", employee.id));
}
promise.then(() => console.log("All done."));

Asynchronous "recursion"

The above solution creates all promises in one sweep. To delay the creation of promises until they are really needed, you could create an asynchronous loop:

(function loop(i) {
    if (i >= employees.length) {
        console.log("All done.");
        return;
    }
    let employee = employees[i];
    fetch(someUrlToServerA + employee.id)
        .then((response) => response.json()) // assuming you get data as JSON
        .then((data) => postData(someUrlToServerB + employee.id, doCalculations(data)))
        .then(() => console.log("Finished for", employee.id))
        .then(() => loop(i+1));
})(0);

The async await version

Because of the async and await keywords, the for loop here does not do all iterations synchronously, but only gets to the next iteration when the promises created in the previous iteration have been resolved. The second code snippet is a better version than the first when it comes to doing things one after the other. Again, it misinterprets the value that the fetch promise resolves to. It resolves to a response object, not to the data. You should also declare data as a variable or it will be global (in sloppy mode):

(async function(){
    for (let employee of employees) {
        let response = await fetch(someUrlToServerA + employee.id);
        let data = await response.json();
        let result = doCalculations(data);
        await postData(someUrlToServerB + employee.id, result);
        console.log("Finished for", employee.id);
    }
    console.log("All done.");
})();

Running in parallel

Although JavaScript cannot execute multiple lines of its code in parallel, the underlying APIs (which may rely on non-JS code and operating-system calls) can operate in parallel. So indeed the processes that deal with HTTP requests and inform JavaScript (via its event queue) that a request has a response can run in parallel.

If you want to go that way, then you should initiate some (or all) fetch calls synchronously, and use Promise.all to wait for all those returned promises to resolve.

Your first piece of code would then need to be rewritten as:

let promises = [];
for (let employee of employees) {
    promises.push(fetch(someUrlToServerA + employee.id)
        .then((response) => response.json()) // assuming you get data as JSON
        .then((data) => postData(someUrlToServerB + employee.id, doCalculations(data)))
        .then(() => console.log("Finished for", employee.id)));
}
Promise.all(promises).then(() => console.log("All done."));

Limiting parallelism

If you want a hybrid solution, where the number of pending promises is limited to, let's say, 4, then you need to combine the use of Promise.all (working on an array of 4 promises) with the chaining that is happening in the first code block (using promise = promise.then()).

I'll leave the details of that design for you. If you have an issue with getting it to work, you can come back with a new question.
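To illustrate the batching idea only (a rough sketch, not a polished solution), here is one possible shape. Note that `processEmployee` is a hypothetical stand-in for the fetch/doCalculations/postData pipeline for one employee:

```javascript
// Sketch: process items in batches of `batchSize`, so at most that many
// promises are pending at the same time. `processEmployee` is a hypothetical
// stand-in for the fetch -> doCalculations -> postData pipeline.
const employees = Array.from({ length: 10 }, (_, i) => ({ id: i }));

function processEmployee(employee) {
  // simulate asynchronous work with a short timer
  return new Promise((resolve) => setTimeout(() => resolve(employee.id), 10));
}

async function processInBatches(items, batchSize) {
  const done = [];
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize).map(processEmployee);
    done.push(...await Promise.all(batch)); // waits for the whole batch
  }
  return done;
}

processInBatches(employees, 4).then((ids) => console.log("All done.", ids));
```

Note that this waits for the slowest task in each batch before starting the next batch; a true pool of 4 would start a new task as soon as any one of the pending ones finishes.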

trincot
  • about "The first piece of code is not well designed, as the for loop executes synchronously, so the next iteration will not wait for the promise of the previous iteration to resolve." it was something I was thinking: if I wait for the first employee to be processed first, then isn't it true I may as well write the whole program in the usual way (synchronously)? I don't need the JS thread to be free to be doing the UI handling, etc like on the browser – nonopolarity Dec 29 '19 at 09:24
  • I don't know how you would write it *synchronously*, as ways to really wait synchronously for a request to come back have been deprecated (as they are detrimental to the user experience). Maybe you mean something else by *synchronously*. – trincot Dec 29 '19 at 09:26
  • does Node.js have synchronous calls, so that network request can be "blocking" like before? That's because Node.js programs can very well be server code or standalone programs. Or maybe just comparing if C# or Ruby can write it synchronously and they also have asynchronous ways to write it – nonopolarity Dec 29 '19 at 09:38
  • There are Node packages that provide blocking HTTP requests, like [sync-request](https://www.npmjs.com/package/sync-request), but the documentation warns in bold that it should not be used in production, and rightly so. I see in fact no reason why you should try to avoid the asynchronous pattern: it only gives advantages once you work *with* it (and not against it). – trincot Dec 29 '19 at 09:54
  • thanks. You mean if we have C# or Ruby programs that do batch processing (like processing employee data above), we can write it the synchronous way, but once we have a method to write it asynchronously, then we should use the asynchronous method to write it? What if we actually wait till one employee's data is processed before we move onto the next, then isn't the synchronous and asynchronous method close to identical? – nonopolarity Dec 29 '19 at 11:53
  • If you use `await` (or the callback of `then`) to wait for issuing the next request, JavaScript is not blocked: it can in the meanwhile deal with other events and other JS code could execute. If there are really no events to deal with, then it didn't hurt either to do it that way. I would *always* go for the asynchronous pattern, even when you have the option to do it synchronously with like C#/Ruby. But C# and Ruby allow multithreading, so then doing a synchronous HTTP request is less of an issue, as it is *not* blocking other threads. – trincot Dec 29 '19 at 12:31
  • The first version of the code `promise = promise.then(() => fetch(` is actually quite surprising... it will let the computer go like "BAM, BAM, BAM, ..." setting up *all* the continuation points (the `then(continuationCode)`)... if there are 1000 URLs, then it just set up all the continuation points, even the one that says "All Done", and the program is all done, within the first second of the execution. Now, it is just a lot of continuation points (the callbacks), waiting to be invoked... – nonopolarity Jan 06 '20 at 06:12
  • that is such a shocking way to view a program, growing up understanding a computer program working all synchronously – nonopolarity Jan 06 '20 at 06:12
  • the async function way is even more interesting: we made JS asynchronous, and now we use the async function, together with the `await`, to make it work "as if" it is synchronous... – nonopolarity Jan 06 '20 at 06:14
  • if all the promise setup info goes into a table that is in the heap, then we can even have a heap overflow. Imagine if it is processing 4 billion data items, and we use promises to do it, and it just keeps on setting up promises and their continuation points (the `then()`), it can be doing nothing much but the promise table has consumed all the computer memory (RAM)... then we have heap overflow, not stack overflow – nonopolarity Jan 06 '20 at 06:18
  • Concerning the memory consideration: I injected another version based on recursion, which avoids this memory build-up and allows the garbage collector to do its thing. It comes closer to the `async` `await` version. – trincot Jan 06 '20 at 06:29
  • your Asynchronous "recursion" method is to let a continuation point create a new series of promises... instead of creating all promises in one shot... – nonopolarity Jan 06 '20 at 06:52
  • if we were afraid of GOTO statements and spaghetti code, now the continuation points are like spaghetti, except they are modularized spaghetti? I think I might have to understand generator and async function internals to understand all these or else sometimes it feels like it just magically work and don't know what is going on. (one time I read how to generate the "power set" by using recursion on generator function)... i just treated it "happening by magic". – nonopolarity Jan 06 '20 at 06:58
  • That's what `async` `await` solves in my opinion: it makes code for such asynchronous processing much easier to read. The downside being that coders may forget the asynchronous nature of execution, leading to other coding errors. You may also be interested in another answer I once gave on [asynchronous looping](https://stackoverflow.com/questions/40328932/javascript-es6-promise-for-loop/40329190#40329190). It gives a few more possibilities to choose from. – trincot Jan 06 '20 at 08:28