I have a simple server that uses select(). It looks like this:
#!/usr/bin/env python2
import select, socket

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setblocking(0)
server.bind(('localhost', 50000))
server.listen(5)

# TCP Keepalive Options
#server.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
#server.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 1)
#server.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 3)
#server.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 5)

inputs = [server]
print "Listening on port 50000"

while True:
    readable, writable, exceptional = select.select(inputs, [], inputs)
    for s in readable:
        if s is server:
            connection, client_address = s.accept()
            print "New client connected: %s" % (client_address,)
            connection.setblocking(0)
            inputs.append(connection)
        else:
            data = s.recv(1024)
            if data:
                print "Data from %s: %s" % (s.getpeername(), data.replace('\n', ''))
            else:
                print "%s disconnected" % (s.getpeername(),)
                inputs.remove(s)
                s.close()
    for s in exceptional:
        print "Client at %s dropped out" % (s.getpeername(),)
        inputs.remove(s)
        s.close()
I can connect to it using telnet clients, and it works great. It doesn't respond to the clients, but for this simple example, that's fine.
The problem I'm seeing is this: if a client disconnects without sending a TCP FIN or TCP RST, the server doesn't seem to ever figure out that the client is gone.
I simulate the client disappearing by doing this:
- Run the server
- Connect a telnet client to the server
- Use iptables to block the telnet client from talking to the server
As far as I know, the normal solution to this is to turn on TCP Keepalive, which I do by uncommenting the TCP Keepalive section. When I do that and follow the same test procedure to make a client disappear in the middle of a connected session, it seems that once the keepalive probes time out, select() stops blocking and returns the client in the "readable" list (as opposed to the exceptional list). My server then tries to read from that socket with s.recv(1024), which raises a socket.error exception and crashes the server.
I know I can probably catch the exception and deal with it, but I'm more curious why:
- select() doesn't pick up on the fact that the client disappeared
- I would think one of the most important types of exceptions that select() would be looking for is if the client disappeared. Or is it looking for garbled input?
- Even when I explicitly enable TCP Keepalive, select() still puts the timed-out socket into the readable list instead of the exceptional list
Is this expected? Is there a way to have select() put clients that disappeared into the exceptional list instead? Or is it important not to assume that just because select() said a socket is ready for reading, recv() won't fail?
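For reference, the "catch the exception" workaround I mentioned above would presumably look something like the sketch below. The helper name recv_or_none is my own invention, not part of any library, and it assumes that any socket error other than EAGAIN/EWOULDBLOCK means the client is gone, which may be too broad for a real server.

```python
import errno
import socket


def recv_or_none(sock, bufsize=1024):
    """Read from a socket that select() reported as readable.

    Returns the received bytes, an empty byte string if there was
    nothing to read after all, or None if the peer is gone.
    (Hypothetical helper, for illustration only.)
    """
    try:
        data = sock.recv(bufsize)
    except socket.error as exc:
        # A non-blocking socket can still report "no data yet".
        if exc.errno in (errno.EAGAIN, errno.EWOULDBLOCK):
            return b''
        # Assumption: any other error (e.g. ETIMEDOUT after unanswered
        # keepalive probes, or ECONNRESET) means the client disappeared.
        return None
    # An empty read from recv() means the peer closed the connection.
    return data if data else None
```

In the loop above, the else branch would then call recv_or_none(s) instead of s.recv(1024), remove and close the socket when the result is None, and print the data only when it is non-empty.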
Edit: This question is not a duplicate of the question I asked earlier here, as this one deals specifically with select(), and how it handles exceptions. This one actually includes code which I learned from the other question.