When trying to establish a largeish number of TCP connections in parallel I observe some weird behavior I consider a potential bug in gen_tcp.

The scenario is a server listening on a port with multiple concurrent acceptors. From a client I establish a connection by calling gen_tcp:connect/3; afterwards I send a "Ping" message to the server and wait in passive mode for a "Pong" response. When performing the gen_tcp:connect/3 calls sequentially, everything works fine, including for a large number of connections (I tested up to ~28000).

The problem occurs when trying to establish a lot of connections in parallel (depending on the machine, between ~75 and several hundred). While most of the connections still get established, some connections fail with a closed error in gen_tcp:recv/3. The weird thing is that these connections did not fail earlier: the calls to gen_tcp:connect/3 and gen_tcp:send/2 were both successful (i.e. neither returned an error). On the server side I don't see a matching connection for these "weird" connections, i.e. no returning gen_tcp:accept/1. It is my understanding that a successful gen_tcp:connect/3 should result in a matching accepted connection on the server side.

I already filed a bug report, where you can find a more detailed description and a minimal code example that demonstrates the problem. I was able to reproduce the problem on Linux and Mac OS X and with different Erlang versions.

My questions here are:

  1. Is anyone able to reproduce the problem and confirm that this is erroneous behavior?
  2. Any ideas for a workaround? How can I deal with this problem, other than starting all the connections sequentially (which takes forever)?
jvf
  • Without looking at the example my first thought is that the tcp `backlog` size might be too small. Have you tried raising it? (http://erlang.org/doc/man/gen_tcp.html#listen-2) See http://veithen.github.io/2014/01/01/how-tcp-backlog-works-in-linux.html for a description of what it does. – johlo Jun 20 '16 at 12:35
  • I agree with @johlo. I looked into your code and you're not using the `{backlog, B}` option. Possible duplicate of [Erlang TCP sockets get closed](http://stackoverflow.com/questions/32511676/erlang-tcp-sockets-get-closed) – A. Sarid Jun 20 '16 at 15:25
  • @johlo Thanks, great tip and great link. Increasing the backlog solved the problem and the article explains why the call to `gen_tcp:connect/3` returns without error (because it got the SYN/ACK from the server and sent the final ACK), but the server does not end up in state established (because it ignored the final ACK). – jvf Jun 20 '16 at 19:25
  • I started to answer the question, based on the suggestions from @johlo. Feel free to help improve the answer. I am still a bit unclear on why the problem is only identified when the client tries a receive, and not in the send. Is this specific to how send and receive work? Or is this due to timings (server was still retrying `SYN-ACKs` when the send was attempted)? – jvf Jun 21 '16 at 13:45

1 Answer

TCP 3-way handshake

        Client        Server

  connect()│──┐          │listen()
           │  └──┐       │
           │      SYN    │
           │        └──┐ │
           │           └▶│   STATE
           │          ┌──│SYN-RECEIVED
           │       ┌──┘  │
           │   SYN-ACK   │
           │ ┌──┘        │
   STATE   │◀┘           │
ESTABLISHED│──┐          │
           │  └──┐       │
           │     └ACK    │
           │        └──┐ │   STATE
           │           └▶│ESTABLISHED
           ▽             ▽

The problem lies in the finer details of the 3-way handshake for establishing a TCP connection and in the queue for incoming connections at the listen socket. See the excellent article on how the TCP backlog works in Linux (linked in the comments on the question) for details; much of the following explanation is based on it.

In Linux there are actually two queues for incoming connections. When the server receives a connection request (SYN packet) and transitions to the state SYN-RECEIVED, the connection is placed in the SYN queue. If a corresponding ACK is received, the connection is moved to the accept queue for the application to consume. The {backlog, N} (default: 5) option to gen_tcp:listen/2 determines the length of the accept queue.
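As a minimal sketch of what raising the backlog looks like, the following module opens a listen socket with a larger accept queue. The module name `backlog_listen`, the function `start/1`, and the value 1024 are assumptions made for illustration; the other socket options are just common choices for a passive-mode server. Note that on Linux the kernel caps the effective backlog at the `net.core.somaxconn` sysctl, so a larger value may be silently reduced.

```erlang
-module(backlog_listen).
-export([start/1]).

%% Open a listen socket whose accept queue is larger than the default of 5.
%% The value 1024 is an arbitrary, illustrative choice.
start(Port) ->
    Opts = [binary,
            {active, false},      % passive mode, as in the question
            {reuseaddr, true},
            {backlog, 1024}],     % length of the accept queue
    {ok, ListenSocket} = gen_tcp:listen(Port, Opts),
    ListenSocket.
```

Calling `backlog_listen:start(0)` lets the OS pick a free port, which can then be read back with `inet:port/1`.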

When the server receives an ACK while the accept queue is full, the ACK is basically ignored and no RST is sent to the client. There is a timeout associated with the SYN-RECEIVED state: if no ACK is received (or the ACK is ignored, as is the case here), the server will resend the SYN-ACK. The client then resends the ACK. If the application consumes an entry from the accept queue before the maximum number of SYN-ACK retries has been reached, the server will eventually process one of the duplicate ACKs and transition to state ESTABLISHED. If the maximum number of retries has been reached, the server will send a RST to the client to reset the connection.

This explains the behavior observed when starting lots of connections in parallel: the accept queue at the server fills up faster than our application consumes the accepted connections. The gen_tcp:connect/3 calls on the client side return successfully as soon as they receive the first SYN-ACK. The connections do not get reset immediately because the server retries the SYN-ACK. The server does not report these connections as successful, because they are still in state SYN-RECEIVED.
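Besides a larger backlog, the accept queue drains faster if the acceptors do as little work as possible between two calls to gen_tcp:accept/1. The following is only a sketch of that idea, not the code from the bug report: the module name `fast_acceptor` and its functions are hypothetical, and it assumes the listen socket was opened with `binary` and `{active, false}`. Each acceptor hands the accepted socket to a fresh worker process and goes straight back to accepting.

```erlang
-module(fast_acceptor).
-export([start/2]).

%% Start NumAcceptors processes that all accept on the same listen socket.
start(ListenSocket, NumAcceptors) ->
    [spawn_link(fun() -> accept_loop(ListenSocket) end)
     || _ <- lists:seq(1, NumAcceptors)].

accept_loop(ListenSocket) ->
    case gen_tcp:accept(ListenSocket) of
        {ok, Socket} ->
            %% Hand the socket to a worker so this process can accept again.
            Worker = spawn(fun() -> receive go -> handle(Socket) end end),
            ok = gen_tcp:controlling_process(Socket, Worker),
            Worker ! go,
            accept_loop(ListenSocket);
        {error, _Reason} ->
            %% Listen socket closed or another error: stop this acceptor.
            ok
    end.

%% Answer a single "Ping" with "Pong", then close the connection.
handle(Socket) ->
    case gen_tcp:recv(Socket, 4, 5000) of
        {ok, <<"Ping">>} -> gen_tcp:send(Socket, <<"Pong">>);
        _ -> ok
    end,
    gen_tcp:close(Socket).
```

The worker waits for the `go` message so that it only touches the socket after ownership has been transferred to it.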

On BSD-derived systems (including Mac OS X) the queue for incoming connections works a bit differently; see the above-mentioned article.
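Because a successful connect does not guarantee that the server has accepted the connection, a client may also want to treat a `closed` result from the first receive as retryable. The sketch below is one possible client-side mitigation under that assumption, not something taken from the bug report: the module `ping_client`, the retry count, the 100 ms pause, and the 5 second timeouts are all hypothetical choices. Other errors, including a receive timeout, are returned to the caller unchanged.

```erlang
-module(ping_client).
-export([ping/3]).

%% Connect, send "Ping" and wait for "Pong"; retry if the connection
%% turns out to be closed or reset after an apparently successful connect.
ping(_Host, _Port, 0) ->
    {error, retries_exhausted};
ping(Host, Port, Retries) when Retries > 0 ->
    case try_once(Host, Port) of
        ok ->
            ok;
        {error, Reason} when Reason =:= closed; Reason =:= econnreset ->
            timer:sleep(100),              % brief pause before retrying
            ping(Host, Port, Retries - 1);
        {error, _} = Error ->
            Error
    end.

try_once(Host, Port) ->
    case gen_tcp:connect(Host, Port, [binary, {active, false}], 5000) of
        {ok, Socket} ->
            Result = exchange(Socket),
            gen_tcp:close(Socket),
            Result;
        {error, _} = Error ->
            Error
    end.

exchange(Socket) ->
    case gen_tcp:send(Socket, <<"Ping">>) of
        ok ->
            case gen_tcp:recv(Socket, 4, 5000) of
                {ok, <<"Pong">>} -> ok;
                {ok, Other} -> {error, {unexpected_reply, Other}};
                {error, _} = Error -> Error
            end;
        {error, _} = Error ->
            Error
    end.
```

For example, `ping_client:ping({127,0,0,1}, Port, 3)` makes up to three attempts against a local server.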

jvf