I have been working with a simple server that sends a heartbeat packet every 30 seconds to a client, which acknowledges it with a heartbeat reply packet. When I brutally terminate the server by sending it SIGKILL or SIGSEGV, the client discovers this readily enough via the select() and read() system calls. Then I started wondering what happens if the server dies just before the client writes its heartbeat reply packet, so I put a 20-second sleep into the client code and killed the server in the meantime. The client-side write nevertheless succeeded. A second write immediately afterwards triggered the expected SIGPIPE signal, and write() failed with EPIPE. As far as I can tell this is normal behaviour. However, just out of curiosity, I printed out the client-side TCP state. It turned out to be:
- TCP_ESTABLISHED - Before sending the server SIGKILL.
- TCP_CLOSE_WAIT - After the server-side SIGKILL, before the first client-side write.
- TCP_CLOSE - After the first and second write attempts.
So my questions are:
- Why does the first write not raise SIGPIPE and return EPIPE?
- Can I conclude that the connection to the server is down if the TCP state is TCP_CLOSE after the first write, or do I have to resend the data one more time to be sure?
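Since read() already reports the server's death by returning 0, the same check can be made right before the heartbeat reply is written, without consuming any data. A minimal sketch, assuming a hypothetical helper named `peer_has_closed`; it only reports a FIN that has already arrived, so it cannot rule out the server dying just after the check.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper.
 * Returns 1 if the peer has shut down its side and no unread data remains,
 * 0 if the connection still looks open from this side,
 * -1 on any other error (for example ECONNRESET).
 * MSG_PEEK leaves pending data in the receive queue; MSG_DONTWAIT avoids blocking.
 */
int peer_has_closed(int fd)
{
    char byte;
    ssize_t n = recv(fd, &byte, 1, MSG_PEEK | MSG_DONTWAIT);

    if (n == 0)
        return 1;   /* end of stream: a plain read() would return 0 */
    if (n > 0)
        return 0;   /* data is pending, so end of stream not reached yet */
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return 0;   /* nothing to read and no FIN seen so far */
    return -1;
}
```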
A diagram of what is happening as I understand it at the moment:
                     server                                client
[ESTABLISHED]           |                                     | [ESTABLISHED]
SIGKILL or close() -->  |                                     |
[FIN_WAIT_1]            |------------FIN M------------------->| [CLOSE_WAIT]
                        |                                     |                ---\
[FIN_WAIT_2]            |<-----------ACK M+1------------------|                    |
                        |                                     |                    | a read performed after a
[TIME_WAIT]             |<-----------FIN N--------------------| [LAST_ACK?]        |-- server-side SIGKILL returns 0
                        |                                     |                    | but write succeeds
                        |------------ACK N+1----------------->| [CLOSE]            |
                        |                                     |                ---/
                        |                                     |
                        |                                     |                ---\
                        |                                     | [CLOSE]            | After the first write returns
                        |                                     |                    | the TCP/IP state is CLOSED
                        |                                     | [CLOSE]            | but even so only a second
                        |                                     |                    | write returns EPIPE and raises SIGPIPE.
                        |                                     | [CLOSE]            |
                        |                                     |                    v
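The sequence in the diagram can be reproduced in a single process over loopback. The following is a minimal sketch under several assumptions: it targets Linux, closing the accepted socket stands in for the server-side SIGKILL (the kernel closes the socket either way), SIGPIPE is ignored so that the second write() reports EPIPE instead of killing the process, and `run_experiment` and `struct write_result` are hypothetical names.

```c
#include <errno.h>
#include <netinet/in.h>
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

struct write_result {
    ssize_t first;      /* return value of the first write() */
    ssize_t second;     /* return value of the second write() */
    int second_errno;   /* errno after the second write() */
};

/* Runs the experiment on 127.0.0.1; returns 0 if the setup succeeded. */
int run_experiment(struct write_result *out)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    struct pollfd pfd;
    int lfd, cfd, sfd;

    signal(SIGPIPE, SIG_IGN);           /* report EPIPE instead of dying */

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                  /* let the kernel pick a free port */

    lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0 ||
        bind(lfd, (struct sockaddr *)&addr, sizeof addr) != 0 ||
        listen(lfd, 1) != 0 ||
        getsockname(lfd, (struct sockaddr *)&addr, &len) != 0)
        return -1;

    cfd = socket(AF_INET, SOCK_STREAM, 0);
    if (cfd < 0 || connect(cfd, (struct sockaddr *)&addr, sizeof addr) != 0)
        return -1;
    sfd = accept(lfd, NULL, NULL);
    if (sfd < 0)
        return -1;

    close(sfd);                         /* the "server" goes away: FIN is sent */
    close(lfd);

    /* Wait up to 1 s until the FIN is visible on the client (read() would return 0). */
    pfd.fd = cfd;
    pfd.events = POLLIN;
    poll(&pfd, 1, 1000);

    out->first = write(cfd, "ack", 3);  /* only hands the data to the local kernel */

    /* Wait up to 1 s for the peer's reaction; with events == 0 poll() reports
     * only POLLERR/POLLHUP, which Linux raises once the connection is reset. */
    pfd.events = 0;
    poll(&pfd, 1, 1000);

    errno = 0;
    out->second = write(cfd, "ack", 3);
    out->second_errno = errno;

    close(cfd);
    return 0;
}
```

On a Linux machine this should show the same pattern as my client: the first write() returns 3, the second returns -1 with errno set to EPIPE.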