
I'm reading the docs of the cluster module in Node.js:
http://nodejs.org/api/cluster.html

It claims the following:

When multiple processes are all accept()ing on the same underlying resource, the operating system load-balances across them very efficiently.

This sounds reasonable, but even after a couple of hours of googling, I haven't found any article or anything at all which would confirm it, or explain how this load balancing logic works in the operating system.

Also, what operating systems are doing this kind of effective load balancing?

Venemo
  • I guess you can't find anything related to it because JS is not supposed to be OS-dependent. I don't know why that documentation mentions the OS; maybe it's considering Chrome as the OS. – jondinham Sep 19 '12 at 13:10
  • Of interest, https://www.citi.umich.edu/u/cel/linux-scalability/reports/accept.html – Steve-o Sep 19 '12 at 13:46
  • 1
    @Paul - Node.js is a server-side framework, it's not running in the browser. – Venemo Sep 19 '12 at 15:57
  • @Steve I also found that link while googling, but it deals with a different problem in an old version of the Linux kernel. – Venemo Sep 19 '12 at 16:31
  • Research the 'accept mutex' as used in Apache [ref](http://www.fmc-modeling.org/category/projects/apache/amp/4_3Multitasking_server.html): all waiting threads wake on each new connection. – Steve-o Sep 19 '12 at 17:06

1 Answer


"Load balancing" is perhaps a poor choice of words; essentially it's a question of how the OS chooses which process to wake up and/or run next. Generally, the process scheduler tries to pick the next process to run based on criteria such as giving an equal share of CPU time to processes of equal priority, CPU/memory locality (not bouncing processes between CPUs), and so on. Anyway, by googling you'll find plenty of material on process scheduling algorithms and implementations.

Now, for the particular case of accept(), it also depends on how the OS implements waking up processes that are blocked in accept().

  • A simple implementation is to just wake up every process blocked on the accept() call, then let the scheduler choose the order in which they get to run.

  • The above is simple, but it leads to the "thundering herd" problem: only the first process succeeds in accepting the connection, and the others go back to blocking. A more sophisticated approach is for the OS to wake up only one process; here the choice of which process to wake can be made by asking the scheduler, or e.g. by picking the first process in the blocked-on-accept()-for-this-socket queue. The latter is what Linux has done for a decade or more, per the link already posted in the comments.

  • Note that this only works for blocking accept(); for non-blocking accept() (which I'm sure is what node.js is doing) the issue becomes which of the processes blocked in select()/poll()/whatever to deliver the event to. The semantics of poll()/select() actually demand that all of them be woken up, so you have the thundering herd issue there again. For Linux, and probably in similar ways on other systems with system-specific high-performance polling interfaces, it's possible to avoid the thundering herd by using a single shared epoll fd and edge-triggered events. In that case the event is delivered to only one of the processes blocked in epoll_wait(). I think that, similar to blocking accept(), the choice of which process to deliver the event to is just to pick the first one in the queue of processes blocked in epoll_wait() on that particular epoll fd.

So at least on Linux, both for blocking accept() and for non-blocking accept() with edge-triggered epoll, there is no scheduling per se when choosing which process to wake. But OTOH, the workload will probably end up quite evenly balanced between the processes anyway, as essentially the system round-robins the processes in the order in which they finish their current work and go back to blocking in epoll_wait().

janneb
  • Awesome, thanks for the explanation. So basically, does this mean that instead of implementing any load balancing, they just trust the kernel to "do the right thing" by giving the new connection to the least busy process? – Venemo Sep 20 '12 at 08:49
  • Thanks! This is a great explanation, and matched what we were seeing running on metal with the 2.6 Linux kernel very well. However, running on AWS with the 3.2 kernel, we're seeing something very different - certainly not round-robin. Do you know where one should look for changes between the kernels or different performance on virtualized hardware? – Brett Dec 03 '12 at 21:14
  • 1
    @Brett: Sorry, no idea. Just for curiosity, what is it that you're seeing on AWS? For a flailing-in-the-dark idea, I'd guess there might be some differences in how the VM host chooses which VM CPU's to schedule? – janneb Dec 09 '12 at 21:37
  • @janneb We tried the 3.2 kernel on bare metal, and saw the same thing as on AWS. I asked a question about it here: http://stackoverflow.com/questions/13770826/poorly-balanced-socket-accepts-with-linux-3-2-kernel-vs-2-6-kernel, and commented on a node.js issue here: https://github.com/joyent/node/issues/3241 – Brett Dec 10 '12 at 13:47
  • Note that on v0.11, round robin load balancing is used. – Farid Nouri Neshat Jun 05 '13 at 03:57