According to this Socket FAQ article, Nagle's algorithm is one of several mechanisms that can cause data to sit in the TCP send buffer rather than go out on the wire immediately. The delay from the Nagle algorithm, through its interaction with delayed ACKs, can be up to 200 ms.
For some reason, Nagle's algorithm can be turned off completely, but it cannot be flushed just once. This really puzzles me. Why is there no way to say, "just this one time, don't wait for any more data; act as if Nagle's 200 ms were already up"?
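To make concrete what I mean, here is a minimal sketch in POSIX C. TCP_NODELAY is the only standard knob, and the closest workaround I know of is toggling it around the one urgent write. The helper names are mine, and the flush-on-set behaviour is documented for Linux in tcp(7) but may not hold on every platform:

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* All-or-nothing: disable Nagle for the lifetime of the socket,
   so every send() is pushed out as soon as possible. */
int disable_nagle(int fd)
{
    int one = 1;
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
}

/* Approximation of a one-shot "flush": briefly turn Nagle off,
   write, then turn it back on. On Linux, setting TCP_NODELAY
   also forces any pending partial segment out immediately. */
ssize_t send_and_flush(int fd, const void *buf, size_t len)
{
    int one = 1, zero = 0;
    setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
    ssize_t n = send(fd, buf, len, 0);
    setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &zero, sizeof(zero));
    return n;
}
```

But this costs two extra syscalls per flush and is racy if other threads are writing to the same socket, which is exactly why a real one-shot flush operation would be nicer.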
Wouldn't that make perfect sense, and strike a good balance between disabling Nagle entirely, leaving it on all the time, and implementing one's own protocol from scratch?