I have shipped an online (grid-based) videogame that uses TCP to ensure reliable communication in a server-client network topology. My game works fairly well, but suffers from higher than expected latency (similar TCP games in the genre seem to do a better job of keeping latency to a minimum).
While investigating, I discovered that the latency is only unexpectedly high for clients running Microsoft Windows (as opposed to Mac OS X clients). Furthermore, I discovered that if a Windows client sets TcpAckFrequency=1 in the registry and restarts their machine, their latency becomes normal.
It would appear that my network design did not take into account delayed acknowledgement:
"A design that does not take into account the interaction of delayed acknowledgment, the Nagle algorithm, and Winsock buffering can drastically effect performance." (http://support.microsoft.com/kb/214397)
However, I'm finding it nearly impossible to take into account delayed acknowledgement in my game (or any game). According to MSDN, the Microsoft TCP stack uses the following criteria to decide when to send one ACK on received data packets:
- If the second data packet is received before the delay timer expires (200ms), the ACK is sent.
- If there are data to be sent in the same direction as the ACK before the second data packet is received and the delay timer expires, the ACK is piggybacked with the data segment and sent immediately.
- When the delay timer expires (200ms), the ACK is sent.
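To make the three rules concrete, here is a toy model of them in Python. It is only a sketch of the rules as quoted above, not of the real Windows stack: the function name `ack_times`, the millisecond timestamps, and the assumption that the timer starts when the first unacknowledged segment arrives are all mine.

```python
DELAY_MS = 200  # delayed-ACK timer from the rules quoted above


def ack_times(arrivals_ms, outgoing_ms=()):
    """Return the times (ms) at which the receiver would send ACKs.

    arrivals_ms: times at which data segments reach the receiver.
    outgoing_ms: times at which the receiver itself sends data.
    """
    events = sorted([(t, "in") for t in arrivals_ms] +
                    [(t, "out") for t in outgoing_ms])
    acks = []
    pending_since = None  # arrival time of the segment still waiting for an ACK
    for t, kind in events:
        # Rule 3: the timer ran out before this event happened.
        if pending_since is not None and t >= pending_since + DELAY_MS:
            acks.append(pending_since + DELAY_MS)
            pending_since = None
        if kind == "in":
            if pending_since is None:
                pending_since = t  # first unacknowledged segment starts the timer
            else:
                acks.append(t)     # Rule 1: second segment arrived in time
                pending_since = None
        elif pending_since is not None:
            acks.append(t)         # Rule 2: ACK piggybacks on outgoing data
            pending_since = None
    if pending_since is not None:
        acks.append(pending_since + DELAY_MS)  # Rule 3 for the last segment
    return acks
```

In this model a lone segment, `ack_times([0])`, is acknowledged at 200 ms, while a second segment right behind it, `ack_times([0, 1])`, pulls the ACK forward to 1 ms.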
Reading this, one would presume that the workaround for delayed acknowledgement on Microsoft's TCP stack is as follows:
- Disable the Nagle algorithm (TCP_NODELAY).
- Disable the socket's send buffer (SO_SNDBUF=0), so that a call to send can be expected to send a packet.
- When calling send, if no further data is expected to be sent immediately, call send again with a single byte of data that will be discarded by the receiver.
With this approach, the second data packet will be received by the receiver at around the same time as the previous data packet. As a result, the ACK should get sent immediately from the receiver to the sender (emulating what TcpAckFrequency=1 does in the registry).
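For reference, this is roughly what the workaround looks like in code. It is a minimal Python sketch, not my actual game code: the one-byte tags, the 2-byte length prefix, and the function names are hypothetical and exist only so the receiver can tell a real message from a pad byte. Note also that how a stack treats SO_SNDBUF=0 is platform dependent (Linux, for example, clamps it to a minimum size).

```python
import socket

# Hypothetical one-byte frame tags (illustration only, not a real protocol).
TAG_PAD = 0x00   # filler byte that the receiver discards
TAG_DATA = 0x01  # followed by a 2-byte big-endian length and the payload


def configure_low_latency(sock):
    """Apply the two socket options from the list above."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)  # disable Nagle
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 0)     # request no send buffer


def encode_frame(payload):
    """Wrap a payload so it can be told apart from pad bytes."""
    return bytes([TAG_DATA]) + len(payload).to_bytes(2, "big") + payload


def send_message(sock, payload, more_coming=False):
    """Send one message, plus a pad byte if nothing else is about to be sent."""
    sock.sendall(encode_frame(payload))
    if not more_coming:
        # Second tiny send, intended to trigger the receiver's
        # "ACK when the second data packet arrives" rule.
        sock.sendall(bytes([TAG_PAD]))


def decode_frames(buffer):
    """Parse received bytes; return (payloads, unparsed leftover bytes)."""
    payloads = []
    pos = 0
    while pos < len(buffer):
        tag = buffer[pos]
        if tag == TAG_PAD:
            pos += 1  # discard the filler byte
            continue
        if tag != TAG_DATA:
            raise ValueError("unknown frame tag")
        if len(buffer) - pos < 3:
            break  # header not fully received yet
        length = int.from_bytes(buffer[pos + 1:pos + 3], "big")
        if len(buffer) - pos - 3 < length:
            break  # payload not fully received yet
        payloads.append(bytes(buffer[pos + 3:pos + 3 + length]))
        pos += 3 + length
    return payloads, bytes(buffer[pos:])
```

The receiver appends whatever recv returns to the leftover bytes from the previous call and passes the result to `decode_frames` again, so pad bytes never reach the game logic.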
However, in my testing, this reduced latency by only about half as much as the registry edit does. What am I missing?
Q: Why not use UDP?
A: I chose TCP because every packet I send needs to arrive (and be in order); there are no packets that aren't worth retransmitting if they get lost (or arrive out of order). UDP can only be faster than TCP when packets can be discarded or delivered out of order!