
The following HttpServer program easily handles 8000 requests/s without HTTP keepalive, but a measly 22 requests/s with keepalive.

import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;

import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;
import com.sun.net.httpserver.HttpServer;

public class HSTest {
    public static void main(String[] args) throws IOException {
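        // Listen on port 30006 with a connection backlog of 1000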
        HttpServer hs = HttpServer.create(new InetSocketAddress(30006), 1000);
        hs.createContext("/", new HttpHandler() {
            public void handle(HttpExchange he) throws IOException {
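                // Respond with a fixed 5-byte body; sendResponseHeaders sets the Content-Length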
                byte[] FILE = "xxxxx".getBytes();
                he.sendResponseHeaders(200, FILE.length);
                OutputStream os = he.getResponseBody();
                os.write(FILE);
                os.flush();
                he.close();
            }
        });
        hs.start();
    }
}

Here's how it looks with keepalive:

Wireshark screenshot with HTTP keepalive

Note the huge delays at packets 6, 12 and 17. After the first one, they're always just a little bit over 40ms. In contrast, without keepalive everything's fine:

Wireshark screenshot without HTTP keepalive

That's 3 whole requests before the first ms is over!

I'm using OpenJDK 8 on Debian sid (Linux amd64), with both client and server on the same machine, communicating over the localhost interface. To test, I'm using `ab -n 100000 -c 1 http://localhost:30006/` (no keepalive) and `ab -n 100000 -c 1 -k http://localhost:30006/` (keepalive), as well as curl and chromium (both with keepalive by default).

So what is causing the 40ms delay with HTTP keepalive requests, and how do I make my server fast?

phihag
  • I think you should tell us more about the client setup you use to test this. In the KeepAlive scenario, you seem to have a single source port for all requests. In the non-KeepAlive one, not only do you have a new connection for each request, but the connection attempts are interleaved. What kind of test environment do you use? There are reasons why 100 fixed threads could be held busy by KeepAlive used incorrectly, for example, but that depends highly on the client behavior. – cnettel Feb 23 '17 at 21:15
  • @cnettel Umm, the definition of keepalive is that you use one TCP connection for multiple requests. Therefore, the client port is always the same! Also, I don't see interleaved connection attempts - can you elaborate where you do see them? I see one connection from port 59712 created, used and destroyed (packets 1-12), then one from port 59714 (packets 13-24), then one from 59716 (packets 25-). Nevertheless, I added information about how I test: using `ab` and `curl`. – phihag Feb 23 '17 at 21:27
  • You need to send a content-length header in the response. – user207421 Feb 23 '17 at 23:44
  • @EJP I am under the impression that I am setting one in the line `he.sendResponseHeaders(200, FILE.length);`. Looking at the packet details in wireshark or the output of `curl -v http://localhost:30006/` shows that it's there already. Can you verify that setting a `Content-Length` header fixes the problem? If so, please do post an answer. – phihag Feb 24 '17 at 00:14
  • @phihag Well, that's my point. Even if you use KeepAlive, you could very well have multiple clients active. Testing throughput in terms of what a single-threaded client is able to hit the server with is a rather odd scenario, in my mind. When looking closer now, I was wrong about the interleaving. – cnettel Feb 24 '17 at 10:05

1 Answer


As hinted in the comments, I think the main cause of concern here is that it is not "normal" to require extremely high HTTP throughput over a single connection (without tweaking away from default settings). If you got similarly disastrous numbers when allowing multiple clients (e.g. the `-c 100` flag to `ab`), that would be a different issue. Overall, KeepAlive has the effect of hogging threads on one-thread-per-connection servers.

I think what you are observing is related to Nagle's algorithm (which the TCP_NODELAY socket option disables), possibly in combination with delayed ACKs. The no-keepalive case involves so few packets per connection that you are never hit by it.

https://eklitzke.org/the-caveats-of-tcp-nodelay specifically mentions delays of "up to 40 ms" on Linux. http://bugs.java.com/bugdatabase/view_bug.do?bug_id=7068416 mentions a Java system property for enabling TCP_NODELAY in the built-in Java HTTP server. I am quite confident that you'll see different behavior if you enable this flag.
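
For illustration, here is a minimal sketch of what that could look like, adapted from the question's program (the class name `HSTestNoDelay` and the comments are mine); it assumes the `sun.net.httpserver.nodelay` property is read before the server is created, so it is set as the first thing in `main` (equivalently, start the JVM with `-Dsun.net.httpserver.nodelay=true`):

import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;

import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;
import com.sun.net.httpserver.HttpServer;

public class HSTestNoDelay {
    public static void main(String[] args) throws IOException {
        // Must run before HttpServer.create(), since the property is only
        // read when the server's configuration is initialized; equivalent to
        // starting the JVM with -Dsun.net.httpserver.nodelay=true
        System.setProperty("sun.net.httpserver.nodelay", "true");

        HttpServer hs = HttpServer.create(new InetSocketAddress(30006), 1000);
        hs.createContext("/", new HttpHandler() {
            public void handle(HttpExchange he) throws IOException {
                byte[] body = "xxxxx".getBytes();
                he.sendResponseHeaders(200, body.length);
                OutputStream os = he.getResponseBody();
                os.write(body);
                os.flush();
                he.close();
            }
        });
        hs.start();
    }
}

If Nagle's algorithm is indeed the culprit, the small responses should then go out immediately instead of waiting for the peer's delayed ACK.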

Another avenue would be changing the delayed ACK timeout to something other than 40 ms. See e.g. https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_MRG/1.3/html/Realtime_Tuning_Guide/sect-Realtime_Tuning_Guide-General_System_Tuning-Reducing_the_TCP_delayed_ack_timeout.html

cnettel
  • Thank you very much! Indeed, starting java with `-Dsun.net.httpserver.nodelay=true` works around the problem (30k requests/s with keepalive, 8k without). Surely an HTTP server should flush after sending the full response; it's not as if any traffic will come from either side until the response has been sent. I am now looking into options to force this behavior in code instead of JVM configuration. – phihag Feb 24 '17 at 10:24
  • 1
    The simplest way to force the system property programmatically is adding the following line as the topmost line of the `main` method: `System.setProperty("sun.net.httpserver.nodelay", "true");` Another way is to set static variable `noDelay` in `sun.net.httpserver.ServerConfig` to `true` using reflections if it has been already initialized and changing the system property has no effect. – Oleg Kurbatov Feb 28 '17 at 09:58
  • @OlegKurbatov Thanks! I awarded the bounty to this answer and accepted it. – phihag Mar 02 '17 at 09:02