Sockets, TCP states and the write system call

Question

I have been working with a simple server that sends a heartbeat packet every 30 seconds to a client, which then acknowledges it with a heartbeat reply packet. When I brutally terminate the server by sending it SIGKILL or SIGSEGV, the client discovers this readily enough via the select() and read() system calls. Then I started wondering what happens if you do that just before the client writes its heartbeat reply, so I put a 20-second sleep into the client code and killed the server in the meantime. I found that the client-side write nevertheless succeeds. A second write immediately afterwards triggered the expected SIGPIPE signal, and write failed with EPIPE. As far as I can tell this is normal behaviour; however, just out of curiosity I printed out the client-side TCP state. It turned out to be:

TCP_ESTABLISHED - Before sending the server SIGKILL.
TCP_CLOSE_WAIT - After the server-side SIGKILL before the first client-side write.
TCP_CLOSE - After the first and second write attempts.

So my questions are:

Why does the first write not raise SIGPIPE and return EPIPE?
Can I conclude that if the TCP state is TCP_CLOSE after the first write, the connection to the server is down, or do I have to resend the data one more time to be sure?

A diagram of what is happening as I understand it at the moment:

                       server                               client

          [ESTABLISHED]  |                                     | [ESTABLISHED] 
 SIGKILL or close () --> |                                     |  
          [FIN_WAIT_1]   |------------FIN M------------------->| [CLOSE_WAIT] 
                         |                                     |            ---\
          [FIN_WAIT_2]   |<-----------ACK M+1------------------|               |  
                         |                                     |               |   a read performed after a
          [TIME_WAIT]    |<-----------FIN N--------------------| [LAST_ACK?]   |-- serverside SIGKILL returns 0
                         |                                     |               |   but write succeeds
                         |------------ACK N+1----------------->| [CLOSE]       |
                         |                                     |            ---/
                         |                                     | 
                         |                                     |            ---\
                         |                                     | [CLOSE]       |   After the first write returns
                         |                                     |               |   the TCP/IP state is CLOSED 
                         |                                     | [CLOSE]       |   but even so only a second write
                         |                                     |               |   returns EPIPE and raises SIGPIPE.
                         |                                     | [CLOSE]       |   
                         |                                     |               v

possible duplicate of [Writing to a closed, local TCP socket not failing](http://stackoverflow.com/questions/11436013/writing-to-a-closed-local-tcp-socket-not-failing) — jxh, Jul 10 '14 at 16:56

score 3 · Answer 1 · answered Jul 10 '14 at 16:08


> Why does the first write not raise SIGPIPE and return EPIPE?

TCP is asynchronous. Your write only copies the data to the socket buffer and returns. The TCP stack takes over in the background and works to send that data. In other words, when send/sendmsg/write returns it does not mean that data has been sent yet.

When the server is killed, the kernel closes the socket for you: it sends any outstanding data followed by a FIN, which puts your client socket into the TCP_CLOSE_WAIT state. The connection is then half-closed, and the client can still send data, provided the server expects it.

Your client sends more data, but the server's OS responds with RST because there is no longer a process to handle the incoming data. That RST puts your client socket into TCP_CLOSE.
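
The sequence above can be reproduced in a single process. The following is a minimal sketch, assuming Linux and a loopback connection; `demo_write_after_peer_close` and `struct write_result` are hypothetical names made up for this illustration, and the 100 ms pause is an arbitrary allowance for the RST to come back.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

struct write_result {
    ssize_t first;     /* return value of the first write after the FIN */
    ssize_t second;    /* return value of the second write */
    int second_errno;  /* errno of the second write, 0 if it succeeded */
};

/* Hypothetical demo: returns 0 if the experiment ran, -1 on setup failure. */
int demo_write_after_peer_close(struct write_result *out)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    char byte;
    int lfd, cfd, sfd;

    /* Process-wide: report EPIPE through errno instead of dying on SIGPIPE. */
    signal(SIGPIPE, SIG_IGN);

    /* "Server": listen on an ephemeral loopback port. */
    lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;
    if (bind(lfd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(lfd, 1) < 0 ||
        getsockname(lfd, (struct sockaddr *)&addr, &len) < 0) {
        close(lfd);
        return -1;
    }

    /* "Client": connect, then let the server side accept. */
    cfd = socket(AF_INET, SOCK_STREAM, 0);
    if (cfd < 0 || connect(cfd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        if (cfd >= 0)
            close(cfd);
        close(lfd);
        return -1;
    }
    sfd = accept(lfd, NULL, NULL);
    if (sfd < 0) {
        close(cfd);
        close(lfd);
        return -1;
    }

    /* Stand-in for SIGKILL: the kernel closes the server's sockets. */
    close(sfd);
    close(lfd);

    /* The FIN shows up as end-of-file; the client is now in CLOSE_WAIT. */
    if (read(cfd, &byte, 1) != 0) {
        close(cfd);
        return -1;
    }

    /* First write: only copies into the socket buffer, so it succeeds. */
    out->first = write(cfd, "x", 1);

    /* Give the peer's RST time to arrive (arbitrary 100 ms). */
    poll(NULL, 0, 100);

    /* Second write: the RST has been processed, so this one fails. */
    errno = 0;
    out->second = write(cfd, "x", 1);
    out->second_errno = (out->second < 0) ? errno : 0;

    close(cfd);
    return 0;
}
```

Under those assumptions the first write should return 1 and the second should return -1 with `errno` set to `EPIPE`, matching what the question reports.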

> Can I conclude that if the TCP state is TCP_CLOSE after the first write that the connection to the server is down or do I have to resend the data one more time to be sure?

TCP_CLOSE is the final TCP state. I am not sure exactly what you are asking, but if you need to make sure that the other peer received and processed your data, the peer needs to send some application-level message back.
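
One common way to build such an application-level confirmation is to number every message and have the peer acknowledge the highest number it has processed. The sketch below only illustrates that bookkeeping; the types and functions (`seq_sender`, `seq_receiver`, `seq_next`, `seq_on_ack`, `seq_resend_from`, `seq_receive`) are hypothetical, and it assumes sequence numbers start at 1 and never wrap.

```c
#include <stdint.h>

/* Sender side: which numbers were handed out and which were confirmed. */
struct seq_sender {
    uint32_t next;   /* sequence number for the next new message */
    uint32_t acked;  /* highest number the peer has confirmed (0 = none) */
};

/* Receiver side: the next sequence number it expects to see. */
struct seq_receiver {
    uint32_t expected;
};

enum seq_verdict { SEQ_OK, SEQ_DUPLICATE, SEQ_GAP };

/* Assign a sequence number to an outgoing message. */
uint32_t seq_next(struct seq_sender *s)
{
    return s->next++;
}

/* Record an acknowledgment; stale or impossible acks are ignored. */
void seq_on_ack(struct seq_sender *s, uint32_t ack)
{
    if (ack > s->acked && ack < s->next)
        s->acked = ack;
}

/* After a reconnect, everything from this number on must be resent. */
uint32_t seq_resend_from(const struct seq_sender *s)
{
    return s->acked + 1;
}

/* Classify an incoming message so duplicates and gaps can be handled. */
enum seq_verdict seq_receive(struct seq_receiver *r, uint32_t seq)
{
    if (seq < r->expected)
        return SEQ_DUPLICATE;  /* already processed: safe to drop */
    if (seq > r->expected)
        return SEQ_GAP;        /* something is missing: ask for redelivery */
    r->expected++;
    return SEQ_OK;
}
```

With this in place the sender does not need to guess whether a write that returned success actually reached the peer: after reconnecting it resends from `seq_resend_from()`, and the receiver drops anything it classifies as `SEQ_DUPLICATE`.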

answered Jul 10 '14 at 16:08

Maxim Egorushkin


Regarding your last quote. If I understand you correctly, after the first write, I can't be sure that the server is down until I send application level data such as repeating my heartbeat reply packet to be sure the server is really down? The thing is that by the time the first write is done the client already knows that the socket is closed because it is in TCP_CLOSE state and it puzzled me a bit that it doesn't just report by raising SIGPIPE right after the first write. It's not like the client is going to pop back to TCP_ESTABLISHED again when I send more data. – os x nerd Jul 11 '14 at 08:19

_after the first write, I can't be sure that the server is down until I send application level data such as repeating my heartbeat reply packet to be sure the server is really down?_ - when you receive a FIN, it is ambiguous whether you can still send data, because the connection can be either half-closed or fully closed. The first write does not get a response from the server immediately; the RST arrives later on, and you discover it on the second write. – Maxim Egorushkin Jul 11 '14 at 08:56
So basically even though the client is in TCP_CLOSE state after the first write that is no guarantee that the server is unreachable and I cannot know that until RST arrives which is after the second write. So the correct way to check for a broken connection on write is to (1) Do a write, (2) test for TCP_CLOSED, (3) if state is TCP_CLOSED write the data again to be sure an RST has arrived and the connection is truly closed? – os x nerd Jul 11 '14 at 09:18

See the [passive close scenario on the TCP state diagram](https://en.wikipedia.org/wiki/Transmission_Control_Protocol#mediaviewer/File:Tcp_state_diagram_fixed_new.svg). After receiving `FIN` the state is `CLOSE_WAIT` and you can still send. In state `CLOSED` you can't send. State `CLOSED` arises in your case when you have sent data (first write) and received `RST` back. – Maxim Egorushkin Jul 11 '14 at 09:30
_State CLOSED arises in your case when you have sent data (first write) and received RST back_. I knew that; I had found a post with that exact diagram in it. What is still bugging me is: if I detect state TCP_CLOSED after the first write, should I resend the data, or just carry on, assume that write #1 succeeded, and send the next app-level message in line? That write would trigger EPIPE, in which case I'd have to resend the last two messages, since I know write #1 might have failed and I'm now sure write #2 did fail. – os x nerd Jul 11 '14 at 10:27

How do you detect TCP states from your application? Normally, application-level protocols with reliable delivery include sequence numbers with each message, so that receivers can detect message gaps and request redelivery or ignore duplicate messages. For example, see FIX protocol sequence numbers. – Maxim Egorushkin Jul 11 '14 at 10:34
I was told to use `getsockopt(p_socket_fd, SOL_TCP, TCP_INFO, &s_tcp_info, &tcp_info_length )`, which made sense to me since the function takes the socket FD I'm interested in as an argument, and judging from that state diagram you linked to, the TCP state machine isn't going anyplace from the state TCP_CLOSED except (in this case) via the listen() system call... but maybe I misunderstood something. – os x nerd Jul 11 '14 at 10:59
Oops that should have been the connect() system call not listen() since the process I'm interested is the client. – os x nerd Jul 11 '14 at 11:13
