
I have a parent process that creates two server sockets and calls select() on them to wait for new connections. When a connection arrives, a message is sent to a child process (created with fork() after the server sockets are created, so the sockets are shared).

In this child, calling accept() on the server socket doesn't work: I get an EAGAIN error (the socket is non-blocking). Calling accept() in the main process, on the other hand, works perfectly.

Of course, I don't normally call accept() in the main process at all; I only tried it to check whether it worked, and it does.

Why can't I call accept() in a child process after a select() in the parent?

EDIT: The goal here is to create a fixed number of workers (let's say 8) to handle client connections, as in the prefork model. These connections will be long-lived, unlike HTTP. The goal is to load-balance connections between workers.

To do this, I use a shared-memory variable that holds, for each worker, the number of currently connected clients. I want to "ask" the worker with the lowest number of clients to handle a new connection.
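
A minimal sketch of that selection step, assuming the counters live in a `multiprocessing.Array` created before the workers are forked. The names `NUM_WORKERS`, `client_counts` and `least_loaded` are hypothetical and are not taken from the actual server code:

```python
import multiprocessing

NUM_WORKERS = 8  # the question mentions a fixed pool of 8 workers

# Hypothetical shared counters: client_counts[i] is the number of clients
# currently connected to worker i ("i" means a C signed int).
# Created before fork() so that every worker sees the same memory.
client_counts = multiprocessing.Array("i", NUM_WORKERS)


def least_loaded(counts):
    """Return the index of the worker with the fewest connected clients."""
    snapshot = counts[:]  # copy the shared values (the slice takes the lock)
    return min(range(len(snapshot)), key=snapshot.__getitem__)
```

Each worker would increment its own slot after a successful accept() and decrement it when a client disconnects. On a tie, `min` returns the lowest index.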

That's why I do the select() in the parent, and then send a message to a child process, because I want to "choose" which process will handle the new connection.

The server listens on more than one socket (one with SSL, one without). That's why I use select() in the parent instead of calling accept() directly in the child processes: a worker can't wait in accept() on several sockets at once.

Alexis Wilke
Thibaut D.
  • EAGAIN is not really an error per se; it just means the call was non-blocking but there was no connection ready. Just sleep for a bit and try again. Any error other than EAGAIN, of course, is an actual error. – Platinum Azure Apr 13 '12 at 14:07
  • Why not call `accept` first, before you fork? Especially since you know it works. – Some programmer dude Apr 13 '12 at 14:18
  • I don't understand what you are doing. Could you please post some minimal code, to allow us to reproduce your observation? – moooeeeep Apr 13 '12 at 14:26
  • Have you called `listen()` before or after forking? – user1202136 Apr 13 '12 at 15:07
  • I edited the question to add more information. – Thibaut D. Apr 13 '12 at 15:42
  • @PlatinumAzure I should not get EAGAIN because select() returned that a connection is available. – Thibaut D. Apr 13 '12 at 15:44
  • @user1202136 Before, in the parent process. The child workers are created only once, after the sockets are set up (bind, listen, etc.), so they inherit them. – Thibaut D. Apr 13 '12 at 15:45
  • @JoachimPileborg Because I don't want to fork() for each client. Having 200 clients would mean 200 processes. – Thibaut D. Apr 13 '12 at 15:45
  • Could you copy-paste a small code sample that demonstrates the problem? I think it would be really interesting for reference, and it would help people check for themselves. – user1202136 Apr 13 '12 at 16:20
  • possible duplicate of [FastCGI / SCGI pre-fork](http://stackoverflow.com/questions/6797222/fastcgi-scgi-pre-fork) – André Caron Apr 13 '12 at 16:48
  • @user1202136 I don't have "small" code; it's part of my server implementation, split across a lot of Python classes. Anyway, what is explained here seems to work with blocking sockets! – Thibaut D. Apr 13 '12 at 19:11
  • @PlatinumAzure Using blocking sockets worked. But I still don't understand why I got an EAGAIN error, because select() reported that my socket was ready to accept a connection. Anyway, I edited the post to give the solution. – Thibaut D. Apr 13 '12 at 19:12
  • If you open a file descriptor in a parent process, you can pass that descriptor to a child process using [Unix-domain socket magic](http://www.lst.de/~okir/blackhats/node121.html). – Adam Rosenfield Apr 13 '12 at 19:38

1 Answer


In fact, the problem was not what I first thought. Here is a recap of what I did to have some basic load-balancing of connections between my worker processes.

  • A main process (the parent) creates 2 server sockets and calls bind() and listen() on them (one with SSL and one without, for example).
  • I create 8 child processes with fork(), so they inherit the parent's sockets.
  • The main process runs select() in an infinite loop.
  • When one of its two sockets becomes ready, it sends a message to a child over a pipe. The child is chosen using a shared-memory value that holds the current number of clients "in the child process": the process currently handling the fewest clients is picked.
  • This child process then calls accept() on the server socket (which of the two sockets to use is passed over the pipe, so the child knows which one to call accept() on).

The problem was that my parent process told a child to accept the connection and then immediately re-entered the loop, where it ran select() again. But if the child had not yet accepted the connection, select() returned again for the same connection. That's why I got an EAGAIN error: accept() was in fact being called twice (or more, depending on speed and inter-process race conditions)!
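
The effect can be reproduced in a single process, without any fork(). The sketch below is an illustration only, not code from the actual server: it connects one client to a non-blocking listening socket on localhost, shows that select() keeps reporting the socket as readable until accept() is called, and that a second accept() for the same connection then fails.

```python
import select
import socket

# A non-blocking listening socket on an ephemeral localhost port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(5)
server.setblocking(False)

# A single client connects; the kernel queues the connection until
# someone calls accept().
client = socket.create_connection(server.getsockname())

# select() is level-triggered: it reports the socket as readable on every
# call for as long as the connection is still waiting in the queue.
ready_first, _, _ = select.select([server], [], [], 1.0)
ready_second, _, _ = select.select([server], [], [], 1.0)

# The first accept() takes the only pending connection...
conn, _addr = server.accept()

# ...so a second accept() triggered by the repeated notification finds
# nothing and fails with EAGAIN / EWOULDBLOCK.
try:
    server.accept()
    second_accept_failed = False
except BlockingIOError:
    second_accept_failed = True

conn.close()
client.close()
server.close()
```

In the multi-process setup, the two select() calls correspond to two iterations of the parent's loop, and the two accept() calls to two children being notified for the same connection.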

The solution is to wait for the child to answer on the pipe with something like "Hey, I accepted the connection, it's OK!", and only then return to the select() loop.

This works perfectly fine. The Python implementation is available here for the curious: https://github.com/thibautd/Kiwi
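
A minimal sketch of that handshake, assuming a Unix system with fork(). It is not the code from the linked repository: the function names, the one-byte messages and the use of a single worker are simplifications made for illustration (a real dispatcher would pick the least-loaded worker).

```python
import os
import select
import socket


def worker(control_r, ack_w, servers):
    """Child: read a socket index from the pipe, accept, then acknowledge."""
    while True:
        msg = os.read(control_r, 1)
        if not msg:                        # parent closed the pipe: stop
            break
        conn, _addr = servers[msg[0]].accept()
        os.write(ack_w, b"k")              # "I accepted the connection"
        conn.setblocking(True)
        conn.sendall(b"hello from worker\n")
        conn.close()
    os._exit(0)


def serve(servers, control_w, ack_r, max_connections):
    """Parent: dispatch each ready socket and wait for the child's answer."""
    handled = 0
    while handled < max_connections:
        ready, _, _ = select.select(servers, [], [])
        for sock in ready:
            os.write(control_w, bytes([servers.index(sock)]))
            os.read(ack_r, 1)              # block until the child has accepted
            handled += 1


def run_demo():
    # Two non-blocking listening sockets, created before fork().
    servers = []
    for _ in range(2):
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.bind(("127.0.0.1", 0))
        s.listen(5)
        s.setblocking(False)
        servers.append(s)

    control_r, control_w = os.pipe()       # parent -> child: which socket
    ack_r, ack_w = os.pipe()               # child -> parent: acknowledgement

    pid = os.fork()
    if pid == 0:                           # child keeps only its pipe ends
        os.close(control_w)
        os.close(ack_r)
        worker(control_r, ack_w, servers)  # never returns

    os.close(control_r)
    os.close(ack_w)

    # One test client per listening socket.
    clients = [socket.create_connection(s.getsockname()) for s in servers]
    serve(servers, control_w, ack_r, max_connections=2)
    replies = [c.recv(64) for c in clients]

    os.close(control_w)                    # lets the worker exit its loop
    os.waitpid(pid, 0)
    os.close(ack_r)
    for s in clients + servers:
        s.close()
    return replies


if __name__ == "__main__":
    print(run_demo())
```

Because the parent blocks on the acknowledgement pipe, it cannot call select() again while the connection is still sitting in the accept queue, so each connection is dispatched exactly once. The cost is that dispatching is serialized: the parent hands out one connection at a time.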

Alexis Wilke
Thibaut D.