22

I have shipped an online (grid-based) videogame that uses the TCP protocol to ensure reliable communication in a server-client network topology. My game works fairly well, but suffers from higher-than-expected latency (similar TCP games in the genre seem to do a better job of keeping latency to a minimum).

While investigating, I discovered that the latency is only unexpectedly high for clients running Microsoft Windows (as opposed to Mac OS X clients). Furthermore, I discovered that if a Windows client sets TcpAckFrequency=1 in the registry and restarts their machine, their latency becomes normal.

It would appear that my network design did not take into account delayed acknowledgement:

A design that does not take into account the interaction of delayed acknowledgment, the Nagle algorithm, and Winsock buffering can drastically effect performance. (http://support.microsoft.com/kb/214397)

However, I'm finding it nearly impossible to take into account delayed acknowledgement in my game (or any game). According to MSDN, the Microsoft TCP stack uses the following criteria to decide when to send one ACK on received data packets:

  • If the second data packet is received before the delay timer expires (200ms), the ACK is sent.
  • If there are data to be sent in the same direction as the ACK before the second data packet is received and the delay timer expires, the ACK is piggybacked with the data segment and sent immediately.
  • When the delay timer expires (200ms), the ACK is sent.

(http://support.microsoft.com/kb/214397)

Reading this, one would presume that the workaround for delayed acknowledgement on Microsoft's TCP stack is as follows:

  1. Disable the Nagle algorithm (TCP_NODELAY).
  2. Disable the socket's send buffer (SO_SNDBUF=0), so that a call to send can be expected to send a packet.
  3. When calling send, if no further data is expected to be sent immediately, call send again with a single byte of data that will be discarded by the receiver.

With this approach, the second data packet will be received by the receiver at around the same time as the previous data packet. As a result, the ACK should get sent immediately from the receiver to the sender (emulating what TcpAckFrequency=1 does in the registry).
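
In code, the three steps above would look roughly like the following Winsock sketch (error handling omitted; `sock`, `payload`, and `payload_len` are placeholders for the real socket and game data, and steps 1-2 would normally be done once at socket setup):

#include <winsock2.h>

/* Sketch of the delayed-ACK workaround described above. */
static void send_with_ack_workaround(SOCKET sock, const char *payload, int payload_len)
{
    /* 1. Disable the Nagle algorithm. */
    BOOL nodelay = TRUE;
    setsockopt(sock, IPPROTO_TCP, TCP_NODELAY,
               (const char *)&nodelay, sizeof(nodelay));

    /* 2. Shrink the send buffer to zero so a send() call is expected to put a
          packet on the wire. */
    int sndbuf = 0;
    setsockopt(sock, SOL_SOCKET, SO_SNDBUF,
               (const char *)&sndbuf, sizeof(sndbuf));

    /* 3. Send the real data, then a single junk byte the receiver discards, so
          the receiver sees a second packet and ACKs without waiting 200ms. */
    send(sock, payload, payload_len, 0);
    char junk = 0;
    send(sock, &junk, 1, 0);
}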

However, from my testing, this improved latency by only about half of what the registry edit does. What am I missing?


Q: Why not use UDP?

A: I chose TCP because every packet I send needs to arrive (and be in order); there are no packets that aren't worth retransmitting if they get lost (or become unordered). Only when packets can be discarded/unordered can UDP be faster than TCP!

Mr. Smith
  • If you care a lot about latency, a UDP-based protocol is a better choice. There are high-level UDP libraries which provide a reliable transport. – CodesInChaos Mar 22 '14 at 21:43
  • You speak of games, yet most games do rely on UDP. Have you researched how they do this? I guess the question is, what is this application for, and do you prefer reliability over speed (unfortunately, you'll need to prefer one over the other)? If it's real-time / action-oriented, I would use UDP with some sort of client/server prediction and correction. If not, 200ms latency shouldn't be an issue. – Brendan Lesniak Mar 22 '14 at 21:51
  • Delayed acks don't cause lag, though you'd normally want to disable the Nagle algorithm for interactive apps. – nos Mar 23 '14 at 00:52
  • @nos They absolutely do, for some applications (such as games). I don't know why you thought that. – Mr. Smith Mar 23 '14 at 00:59
  • As Brendan mentioned, games tend to use UDP to ensure minimal latency. Look for documentation on how the Quake 3 protocol works -- it's really a very simple concept that would likely fit your use case. I hope someone answers your actual question, though, because it is interesting! – Cory Nelson Mar 23 '14 at 08:56
  • Why does the delayed ack cause a latency problem? `send` does not wait for an ack as long as there is still room in the send buffer (which it can do because send does not promise anything was/will be received). And when the TCP window is full I'd expect an ack to be sent immediately. Can you detail the steps that would lead to a lag? – usr Mar 23 '14 at 09:54
  • Why don't you just fix your protocol so that it doesn't suffer from delayed ACK like everyone else does? (They use application-level acknowledgements that the ACKs piggy-back on.) – David Schwartz Jul 14 '14 at 22:12
  • @DavidSchwartz I have been using application-level acknowledgements these past few months. I've been meaning to post a detailed answer explaining (with diagrams) how exactly your suggestion fixes it (for those who deny delayed acks can cause delay). It's important to note though that this only works if at least one party (server or client) has delayed acknowledgements disabled. If both are using delayed acknowledgements, certain read/write patterns will still delay. – Mr. Smith Jul 14 '14 at 22:54
  • @Mr.Smith Most likely, you're still doing something wrong. There should be no need to disable either Nagle or delayed acknowledgements. And if you find you need to, that's the clearest indication that you had better not, because it's hiding whatever pathological behavior your code has triggered, which should instead be fixed in your code. Other people don't have this problem. The most likely suspicion is that you're trying to get the latency down on top of a protocol on top of TCP that was not designed for low latency. Key performance requirements should be designed in, not fixed later. – David Schwartz Jul 15 '14 at 07:10

7 Answers

18

Since Windows Vista, the TCP_NODELAY option must be set prior to calling connect, or (on the server) prior to calling listen. If you set TCP_NODELAY after calling connect, it will not actually disable the Nagle algorithm, yet GetSocketOption will state that Nagle has been disabled! This all appears to be undocumented, and it contradicts what many tutorials/articles on the subject teach.

With Nagle actually disabled, TCP delayed acknowledgements no longer cause latency.
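
A minimal Winsock sketch of the ordering described above (WSAStartup and error handling omitted; `server_addr` is assumed to be a filled-in sockaddr_in):

#include <winsock2.h>

static SOCKET connect_with_nodelay(const struct sockaddr_in *server_addr)
{
    SOCKET sock = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);

    /* Per this answer: set TCP_NODELAY *before* connect(); setting it
       afterwards appears to have no effect on Vista and later, even though
       getsockopt() reports it as set. */
    BOOL nodelay = TRUE;
    setsockopt(sock, IPPROTO_TCP, TCP_NODELAY,
               (const char *)&nodelay, sizeof(nodelay));

    connect(sock, (const struct sockaddr *)server_addr, (int)sizeof(*server_addr));
    return sock;
}

On the server side, the same setsockopt() call would go before listen().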

Mr. Smith
2

There should be nothing you need to do. All of the workarounds you're suggesting are to help protocols that weren't properly designed to work over TCP. Presumably your protocol was designed to work over TCP, right?

Your problem is almost definitely one or both of these:

  1. You are calling TCP send functions with small bits of data even though there is no reason you couldn't call them with larger chunks (see the sketch after this list).

  2. You did not implement application-level acknowledgements of application protocol data units. Implement these so that the ACKs can piggy-back on them.
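
For the first point, a rough sketch of what "larger chunks" can look like in practice: buffer the small per-event messages and hand them to send() once per tick. The names (`out_buf`, `queue_message`, `flush_outgoing`) are illustrative, not taken from the question:

#include <winsock2.h>
#include <string.h>

/* Illustrative outgoing buffer: many small game messages, one send() per tick. */
static char out_buf[8192];
static int  out_len = 0;

static void queue_message(const char *msg, int len)
{
    if (out_len + len <= (int)sizeof(out_buf)) {
        memcpy(out_buf + out_len, msg, len);
        out_len += len;
    }
}

static void flush_outgoing(SOCKET sock)    /* called once per game tick */
{
    if (out_len > 0) {
        send(sock, out_buf, out_len, 0);   /* one larger write instead of many tiny ones */
        out_len = 0;
    }
}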

David Schwartz
  • Please explain the downvote. If I'm unclear, I'd like to know what to clarify. If I'm wrong, I'd like to know why. – David Schwartz Mar 23 '14 at 09:48
  • Neither of these; when I send packets, they are queued in a byte buffer and sent when the socket thread picks them up (at a frequency I specify); 50 small packets will become 1 large packet if sent within a short enough time frame (only the socket thread calls `send` on the socket). And the "single-byte data" I describe functions as #2, which should cause the receiver to ACK. – Mr. Smith Mar 23 '14 at 09:53
  • Your single-byte of data goes in the wrong direction. You need to have the other end acknowledge the application-level data units so that there's a data packet for its ACK to piggy-back on. Your extra call to send just makes things worse -- a small send is the trigger for Nagling, which makes delayed ACK much worse. – David Schwartz Mar 23 '14 at 10:01
  • Both are using the steps I described; the server is sending the single-byte after send, and the client is sending the single-byte after send. Both are necessary for this to work. – Mr. Smith Mar 23 '14 at 10:05
  • No, there's no reason to send an extra byte, it just triggers Nagle, which makes things worse. But you do need to send an acknowledgement of some kind -- a single byte is fine -- to allow the ACK to piggy back. You also should not disable Nagle nor should you shrink the send buffer. Those are workarounds for protocols not designed to work over TCP. Your protocol should be designed to work over TCP. – David Schwartz Mar 23 '14 at 10:08
  • It cannot trigger Nagle, because Nagle is explicitly disabled. The idea behind the single-byte is to have two packets sent to the receiver (so that its firmware would force back an ACK). It was never to allow an ACK to piggy-back onto it, but that sounds like an alternative that could work. I seem to have misinterpreted your #2 point; application-level acknowledgements sounds like it could work! I deeply regret the downvote now... – Mr. Smith Mar 23 '14 at 10:15
  • You shouldn't disable Nagle. Nagle helps prevent the small packets that cause latency problems. Fix the problem at its root, that way you won't have to employ workarounds that have huge potential risks and downsides. – David Schwartz Mar 23 '14 at 10:18
  • I patched it so that the receiver sends a single-byte (junk) to the sender each time one-or-more (non-junk) application-level data units are completely received. However, this made the lag significantly worse; even my Mac OS X machine lags heavily now. Did I miss something else? – Mr. Smith Mar 24 '14 at 01:13
  • Are you sure you aren't doing anything to sabotage yourself? Are you leaving the buffer sizes and Nagling at their default settings? – David Schwartz Mar 24 '14 at 01:38
  • I put `SO_SNDBUF` back to its default setting, but kept Nagle disabled (it doesn't make sense to me why I'd enable Nagle here). – Mr. Smith Mar 24 '14 at 01:44
  • @Mr.Smith Why do you think disabling Nagle will help you? There are lots of ways it can make things worse -- for example, by reducing the number of bytes in the average packet. – David Schwartz Mar 24 '14 at 02:18
  • You're oblivious to the fact that [efficiency doesn't always translate to low-latency](http://en.wikipedia.org/wiki/Nagle's_algorithm#Interactions_with_real-time_systems). There are very few packets that actually do get sent, and more often than not, they cannot be clumped up (read-write-read-write pattern). – Mr. Smith Mar 24 '14 at 03:53
  • @Mr.Smith I'm not oblivious to the fact, I'm suggesting a superior solution -- leave Nagle enabled, leave the socket buffers at their defaults, and ensure that the other side sends application-level acknowledgements that the TCP ACK can piggyback on. – David Schwartz Mar 24 '14 at 04:02
  • 'they are queue'd in a byte buffer, and sent when the socket thread picks them up (at a frequency I specify), 50 small packets will become 1 large packet' - doesn't sound like a latency-conscious design to me:( – Martin James Mar 24 '14 at 19:35
  • Did you just get a downvote also? Or has that been there a while? – Ben Voigt Sep 15 '14 at 14:58
  • @MartinJames 50 small writes will only become 1 large packet where that provides a benefit. The algorithm is smart as long as you design with it in mind. If you disable Nagle, 50 small writes won't become one large packet even where that is essential. – David Schwartz Apr 24 '16 at 17:07
1

Use a reliable UDP library and write your own congestion control algorithm; this will definitely overcome your TCP latency problem.

This is the library I use for reliable UDP transfers:

http://udt.sourceforge.net/

user3239282
1

With this approach, the second data packet will be received by the receiver at around the same time as the previous data packet. As a result, the ACK should get sent immediately from the receiver to the sender (emulating what TcpAckFrequency=1 does in the registry).

I'm not convinced that this will always cause a second, separate packet to be sent. I know you have Nagle's disabled and a zero send buffer, but I've seen stranger things. Some Wireshark dumps might be helpful.

One idea: Instead of your 'canary' packet being only one byte, send a full MSS's worth of data (typically, what, 1460 bytes on a 1500-MTU network).

antiduh
0

To solve the problem, it's necessary to understand the normal functioning of TCP connections. Telnet is a good example to analyze.

TCP guarantees delivery by acknowledging successful data transmission. The "Ack" can be sent as a message by itself, but this introduces quite some overhead - an Ack is a very small message in itself, but the lower-level protocols add extra headers. For this reason, TCP prefers to piggyback the Ack message on another packet it's sending anyway. Looking at an interactive shell via Telnet, there's a steady stream of keystrokes and responses, and if there's a small pause in typing, there's nothing to echo to the screen either. The only case where the flow stops is when you have output without corresponding input, and since you can only read so fast, it's OK to wait a few hundred milliseconds to see if there's a keystroke to piggyback the Ack on.

So, summarizing, we have a steady flow of packets both ways, and the Ack usually piggybacks. If there's an interruption in the flow for application reasons, delaying the Ack won't be noticed.

Back to your protocol: You apparently don't have a request/response protocol. That means the Ack can't be piggy-backed (problem 1). And while the receiving OS will then send separate Acks, it won't spam those.

Your workaround via TCP_NODELAY and two packets on the sending (Windows) side assumes that the receiving side is Windows too, or at least behaves as such. This is wishful thinking, not engineering. The other OS may decide to wait for three packets to send an Ack, which completely breaks your use of TCP_NODELAY to force one extra packet. "Wait for 3 packets" is just an example; there are many other valid algorithms to prevent Ack spam which would not be fooled by your second one-byte dummy packet.

What is the real solution? Send a response at protocol level. No matter the OS then, it will piggyback the TCP Ack on your protocol response. In turn, this response will also force an Ack in the other direction (the response too is a TCP message) but you don't care about the latency of the response. The response is there just so the receiving OS piggybacks the first Ack.
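
As a rough sketch of that idea (the message type and helper below are illustrative, not part of the asker's protocol): after reading incoming data, the receiver writes a tiny protocol-level response, and the OS piggybacks its TCP Ack on that response instead of starting the delayed-Ack timer.

#include <winsock2.h>

enum { MSG_APP_ACK = 0x7F };                  /* illustrative response type */

/* Placeholder for the game's actual message parsing/handling. */
static void process_game_messages(const char *data, int len)
{
    (void)data;
    (void)len;
}

/* Receiver side: read whatever has arrived, hand it to the (placeholder)
   message handler, then answer with a one-byte protocol-level response that
   the TCP Ack can piggyback on. */
static void handle_incoming(SOCKET sock, char *buf, int buf_len)
{
    int n = recv(sock, buf, buf_len, 0);
    if (n > 0) {
        process_game_messages(buf, n);

        char response = MSG_APP_ACK;
        send(sock, &response, 1, 0);          /* carries the TCP Ack with it */
    }
}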

MSalters
-1

I would suggest you leave the Nagle algorithm and the buffers turned on, as their basic purpose is to collect small writes into full/larger packets (this improves performance a lot), but at the same time use FlushFileBuffers() on the socket after you are done sending for a while.

I assume here that your game has some sort of main loop, which processes stuff and then waits for an amount of time before going into the next round:

while(run_my_game)
{
    process_game_events_and_send_data_over_network();
    Sleep(20 - time_spent_processing);
};

I would now suggest inserting FlushFileBuffers() before the Sleep() call:

while(run_my_game)
{
    process_game_events_and_send_data_over_network();
    FlushFileBuffers(my_socket);
    Sleep(20 - time_spent_processing);
};

That way, you delay sending packets at the latest to the moment before your application goes to sleep to wait for the next round. You should get the performance benefit of Nagle's algorithm and minimize delay.

In case this doesn't work, it would be helpful if you posted a bit of (pseudo-)code which explains how your program actually works.

EDIT: There were two more things that came to mind when I thought about your question again:

a) Delayed ACK packets should indeed NOT cause any lag, as they travel in the opposite direction of the data you are sending. At worst they block the sending queue. This, however, will be resolved by TCP after a few packets, when the bandwidth of the connection and the memory limits permit it. So unless your machine has really low RAM (not enough to hold a bigger send queue), or you are really transmitting more data than your connection allows, delayed ACK packets are an optimisation and will actually improve performance.

b) You are using a dedicated thread for sending. I wonder why. AFAIK the socket API is thread-safe, so every producing thread could send the data all by itself. Unless your application requires such a queue, I would suggest also removing this dedicated sending thread, and with it the additional synchronisation overhead and delay it might cause.

I'm specifically mentioning the delay here, as the operating system might decide not to immediately schedule the send thread for execution again when it becomes unblocked on its queue. Typical re-scheduling delays are in the 10ms range, but under load they can skyrocket to 50ms or more. As a workaround, you could try fiddling with the scheduling priorities, but this will not reduce the delay imposed by the operating system itself.

Btw., you can easily benchmark TCP and your network by just having one thread on the client and one on the server that play ping/pong with some data, as in the sketch below.
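
For instance, a rough sketch of the client side of such a ping/pong measurement (the server is assumed to simply echo each byte back; socket setup and error handling are omitted):

#include <winsock2.h>
#include <windows.h>
#include <stdio.h>

/* Send one byte, wait for the echo, repeat, and report the average round trip. */
static void ping_pong(SOCKET sock, int rounds)
{
    char byte = 'p';
    DWORD start = GetTickCount();

    for (int i = 0; i < rounds; ++i) {
        send(sock, &byte, 1, 0);
        recv(sock, &byte, 1, 0);              /* block until the echo arrives */
    }

    printf("average RTT: %lu ms\n",
           (unsigned long)(GetTickCount() - start) / (unsigned long)rounds);
}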

SIGSEGV
-1

every packet I send needs to arrive (and be in order);

This requirement is the cause of your latency.

Either you have a network with negligible packet loss, in which case UDP would be delivering every packet, or you have loss, in which case TCP is doing retransmits, delaying everything by (multiples of) the retransmit interval (which is at least the round-trip time). This delay is not consistent, as it is triggered by lost packets; jitter usually has worse consequences than the predictable delay in acknowledgement caused by packet combining.

Only when packets can be discarded/unordered, can UDP be faster than TCP!

This is an easy assumption to make, but erroneous.

There are ways besides ARQ to cope with drops that provide lower latency: forward error correction methods achieve faster recovery from drops at the expense of additional bandwidth.

Ben Voigt
  • Reason for downvote? You shouldn't downvote information just because you don't like it. If you think something is wrong, please offer a correction as a comment. – Ben Voigt Sep 15 '14 at 14:56
  • The Nagle algorithm wasn't being disabled -- that was the cause of the latency. It's not supposed to matter whether you disable Nagle pre/post socket connect, but apparently that changed on Windows with the release of Vista. There's not really a way to have known, since no exception is thrown or error returned, and `GetSocketOption` reports that Nagle has been disabled even if it actually hasn't. – Mr. Smith Sep 17 '14 at 02:10
  • @Mr.Smith: Disabling Nagle won't eliminate latency caused by packet drops. My answer remains correct. – Ben Voigt Sep 17 '14 at 12:37
  • It's not an answer to my question; please re-read my question thoroughly. The answer to the *higher than expected* latency is Nagle not being properly disabled. This is what other TCP games in the genre do successfully, and what I had not been doing. – Mr. Smith Sep 17 '14 at 12:52
  • @Mr.Smith: Maybe it is on one client. On another, it is packet loss in the network. Because your application is delay-sensitive, you must not use TCP. – Ben Voigt Sep 17 '14 at 12:55