
I've been toying with basic client/server communication in C, and I'm trying to determine the best way of detecting a disconnected client. I know that if I want a blocking solution I can simply branch off read(). However, in a non-blocking context, particularly with a network disruption rather than a clean disconnect, will read() still detect the broken connection in a timely manner?

Right now I'm just sending keep-alive messages from the client on an interval and checking for timeouts server-side, and I'm wondering if there's a cleaner way.

pjablons

2 Answers


Regardless of the blocking mode, a series of TCP send() operations will eventually fail with ECONNRESET if the peer has disconnected for any reason, and a TCP recv() will return zero if the peer has disconnected cleanly.

user207421
  • Can you document this? Or alternatively, how do you reconcile this claim with the POSIX docs that say `send()` yields an `ECONNRESET` error when "A connection was forcibly closed by a peer"? – John Bollinger Nov 21 '15 at 14:56
  • @JohnBollinger It happens if the peer produces an RST for any reason and also if the local TCP produces a RST. That's what `ECONNRESET` means. One of the reasons a peer produces RST is if you send to a port that is already closed at that end. Another is if it closes while it has data in its receive buffer. Another is if it does a forced close via the SO_LINGER trick. The Posix documentation, which I have not seen, would appear to be merely incomplete rather than unreconcilable with this answer. – user207421 Nov 22 '15 at 23:24
  • The POSIX docs are at http://pubs.opengroup.org/onlinepubs/9699919799/functions/send.html, and I quoted in full the description of the condition requiring `send()` to raise `ECONNRESET`. I agree that for TCP, "forcibly closed by a peer" means the peer sends a RST to terminate the connection, but that is not the only way a connection can be severed uncleanly, nor is it certain to result from a connection being severed by a network disruption, such as the OP asks about. It is not part of a clean connection shutdown at all. – John Bollinger Nov 23 '15 at 02:46
  • @JohnBollinger (A) There are two ways a connection can be forcibly severed. (1) The peer issues an RST. (2) The localhost issues an RST. Both result in ECONNRESET. (B) A sequence of sends is *bound* to result in an RST if the network stays down. TCP will time out the retries and reset the connection. This behaviour is required not by POSIX but by RFC 793 and friends. (C) A 'description requiring `send()` to raise `ECONNRESET`' is not the same thing as an exhaustive list defining the *only* conditions under which `send()` can raise `ECONNRESET`. – user207421 Nov 23 '15 at 02:49
  • I agree that POSIX does not forbid `send()` to fail for other reasons than those it documents, with the same or different error numbers, but you cannot safely rely on undocumented behavior. Without reference to system specifics, the POSIX specs are all there is to go on. HOWEVER, on close rereading, I find it reasonable to interpret POSIX as saying that `ECONNRESET` shall be raised if *either* peer forcibly closes the connection, and to consider timing out the connection to be a forcible closure. So there you go, sorry for the noise. – John Bollinger Nov 23 '15 at 03:34
  • All the same, if you're going to send messages to the other machine to probe for its continued connection, then at the application layer that machine needs to be prepared to handle those messages. Although this is indeed an alternative to a keep-alives going the opposite direction, it is pretty analogous, and I would not personally consider it any cleaner. – John Bollinger Nov 23 '15 at 03:40
  • @JohnBollinger You're putting words into my mouth. I haven't suggested a heartbeat protocol. All I've done here is specify the conditions under which a disconnect is detectable. Whether those sends are pings or normal application messages makes no difference. – user207421 Nov 23 '15 at 04:21
  • No, you did not specifically suggest a heartbeat protocol, and I did not say otherwise. It was the OP who did (albeit in different words), and his actual question was whether there is a cleaner approach. I have simply observed that solving his problem via the mechanism you describe does not yield a fundamentally different or cleaner approach than what he is already doing. – John Bollinger Nov 23 '15 at 14:04

Generally speaking, you cannot affirmatively detect an uncleanly-disconnected client.

Edit: that is, there is no general-purpose are_you_still_there() function, and there is no way to implement one that you can rely upon always to swiftly deliver "no" responses with neither false negatives nor false positives.

If you are using a connectionless network protocol (i.e. UDP), then you cannot affirmatively detect that a client has permanently disappeared at all.

Edit: TCP provides for automatic data retransmission in the event that the receiver does not acknowledge receipt. Multiple retransmissions of the same data are possible, in which case the timeout between (re)transmissions increases exponentially. After a threshold number of retransmissions or a threshold retransmission timeout is reached without an acknowledgement, a TCP subsystem will deem the connection broken, but this may take several or even tens of minutes, depending on the local protocol implementation and parameters.

You can rely on a read() to signal an error if the TCP implementation detects connection breakage, either by timeout or by the remote peer forcibly severing the connection. It may be tricky or impossible, however, to ensure that such an error is delivered to the application in a "timely" manner if your notion of "timely" is much different from the TCP implementation's.

One generally deals with this problem by employing [edit: application-level] timeouts to judge when to release resources devoted to servicing a client, or possibly via a higher-level protocol that provides for a heartbeat signal or the equivalent (which basically still boils down to timeouts).

Edited to add: implementation of an "are you there" message in the application-layer protocol can provide a means to solicit a response from the peer, so as to be able to measure timeout and/or detect connection closure during times when the connection otherwise would be idle. Inasmuch as this more or less amounts to a means for the local machine to solicit a heartbeat from the remote peer, it's similar to what you already have.

On the other hand, it does provide a framework wherein the local machine can attempt to send unsolicited messages to the peer without interfering with the application. That way, if the TCP implementation has marked the connection closed, whether because of a TCP-level timeout or because the other end closed it, the local machine will be notified of the closure at that time.

John Bollinger
  • How exactly do you reconcile your own first sentence with your Posix quote? – user207421 Nov 22 '15 at 23:24
  • @EJP, I clarified what I mean in my edit, which I believe preceded your comment. More specifically, however, TCP provides no way to determine the state of an unresponsive peer. The protocol's own response to the problem is to time out connections after too many unacknowledged retransmissions, but that speaks only to the local machine's state (it considers the connection closed); the remote machine may still consider the connection open, and absent timeout-triggering probing, it might resume communication later. – John Bollinger Nov 23 '15 at 02:57