I am trying to write a simple Java HTTP client that keeps a socket open and reuses it to query other (or the same) URLs on the same host.

I have a simple implementation that uses java.net.Socket, but somehow performance is worse when I keep the socket open than when I create a new one for each request.

Results first, full executable code below:

With KeepAlive: slower starting at iteration #2

> java -server -Xms100M -Xmx100M -cp . KeepAlive 10 true
--- Warm up ---
18
61
60
60
78
62
59
60
59
60
Total exec time: 626
--- Run ---
26
59
60
61
60
59
60
60
62
58
Total exec time: 576

Recreating the socket every time gives better results:

> java -server -Xms100M -Xmx100M -cp . KeepAlive 10 false
--- Warm up ---
188
34
39
33
33
33
33
33
34
33
Total exec time: 494
--- Run ---
33
35
33
34
44
34
33
34
32
34
Total exec time: 346

KeepAlive.java (standalone, no dependencies)

import java.io.BufferedReader;
import java.io.DataOutputStream;
import java.io.InputStreamReader;
import java.net.InetSocketAddress;
import java.net.Socket;

public class KeepAlive {

    private static final String NL = "\r\n";
    private static final int READ_SIZE = 1000;
    private Socket socket;
    private DataOutputStream writer;
    private BufferedReader reader;

    public static void main(String[] args) throws Exception {
        if (args.length == 2) {
            KeepAlive ka = new KeepAlive();
            System.out.println("--- Warm up ---");
            ka.query(Integer.parseInt(args[0]), args[1].equals("true"));
            System.out.println("--- Run ---");
            ka.query(Integer.parseInt(args[0]), args[1].equals("true"));
        } else {
            System.out.println("Usage: keepAlive <n queries> <reuse socket>");
        }
    }

    private void query(int n, boolean reuseConnection) throws Exception {
        long t0 = System.currentTimeMillis();
        if (reuseConnection) {
            open();
            for (int i = 0; i < n; i++) {
                long tq0 = System.currentTimeMillis();
                query();
                System.out.println(System.currentTimeMillis() - tq0);
            }
            close();
        } else {
            for (int i = 0; i < n; i++) {
                long tq0 = System.currentTimeMillis();
                open();
                query();
                close();
                System.out.println(System.currentTimeMillis() - tq0);
            }
        }
        System.out.println("Total exec time: " + (System.currentTimeMillis() - t0));
    }

    private void open() throws Exception {
        socket = new Socket();
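        // Note: setKeepAlive controls TCP-level keep-alive probes, which are
        // unrelated to HTTP persistent connections (Connection: Keep-Alive).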
        socket.setKeepAlive(false);
        socket.connect(new InetSocketAddress("example.org", 80));
        writer = new DataOutputStream(socket.getOutputStream());
        reader = new BufferedReader(new InputStreamReader(socket.getInputStream()));
    }

    private void query() throws Exception {
        StringBuilder req = new StringBuilder();
        req.append("GET / HTTP/1.1").append(NL);
        req.append("Host: example.org").append(NL);
        req.append("Connection: Keep-Alive").append(NL);
        req.append(NL);
        String reqStr = req.toString();

        long t0 = System.currentTimeMillis();
        writer.writeBytes(reqStr);
        writer.flush();

        String line;
        int contentLength = 0;
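        // Read the status line and headers, remembering Content-Length so we
        // know how many characters of body to consume afterwards.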
        while ((line = reader.readLine()) != null) {
            if (line.startsWith("Content-Length: ")) {
                contentLength = Integer.parseInt(line.substring(16));
            }
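            // A blank line marks the end of the headers; the body follows.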
            if (line.equals("")) {
                char[] buf = new char[contentLength];
                int offset = 0;
                while (offset < contentLength) {
                  int len = contentLength - offset;
                  if (len > READ_SIZE) {
                    len = READ_SIZE;
                  }
                  int ret = reader.read(buf, offset, len);
                  if (ret == -1) {
                    System.out.println("End of stream. Exiting");
                    System.exit(1);
                  }
                  offset += ret;
                }

                break;
            }
        }
    }

    private void close() throws Exception {
        writer.close();
        reader.close();
        socket.close();
    }
}

Now, I'm pretty sure that either:

  1. the web server is slow at handling consecutive requests on a persistent connection (HTTP keep-alive and TCP keep-alive)

  2. something is wrong with the way I use the buffered reader, because that's where all the time is lost; but looking at the other methods available (and I tried a few), I can't find what I need to change to fix this.

Any idea how I could make this faster? Maybe a configuration change on the server itself?


Solution

As explained by apangin below, the slower performance is caused by Nagle's algorithm, which is enabled by default. After calling setTcpNoDelay(true) on the socket, I get the following updated results:
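The fix is a one-line change in open(); here is a sketch of the updated method (everything else is as posted above):

    private void open() throws Exception {
        socket = new Socket();
        socket.setKeepAlive(false);
        // Disable Nagle's algorithm: requests are sent immediately instead of
        // waiting for the previous segment to be acknowledged.
        socket.setTcpNoDelay(true);
        socket.connect(new InetSocketAddress("example.org", 80));
        writer = new DataOutputStream(socket.getOutputStream());
        reader = new BufferedReader(new InputStreamReader(socket.getInputStream()));
    }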

Without keep-alive:

java -server -Xms100M -Xmx100M -cp . KeepAlive 10 false
--- Warm up ---
49
22
25
23
23
22
23
23
28
28
Total exec time: 267
--- Run ---
31
23
23
24
25
22
23
25
33
23
Total exec time: 252

With keep-alive:

java -server -Xms100M -Xmx100M -cp . KeepAlive 10 true
--- Warm up ---
13
12
12
14
11
12
13
12
11
12
Total exec time: 168
--- Run ---
14
12
11
12
11
12
13
11
21
28
Total exec time: 158

So here, the keep-alive version performs far better than the non-keep-alive one, both per iteration and in total execution time. :)

fabien
  • you aren't really testing like for like here. In one you are hammering the server as hard as you can. On the other you are pausing quite a bit between each query. Have you tried testing total time on the client for each run? – BevynQ May 17 '16 at 01:52
  • Yes, that is the point of the test: seeing which of the two behaviors performs best. The question being: "how fast can I make a request and get the response, with or without keep-alive?" I added the total execution time for you, though. – fabien May 17 '16 at 14:42

2 Answers


That's the effect of Nagle's algorithm. It delays sending TCP packets in anticipation of more outgoing data.

Nagle's algorithm interacts badly with TCP delayed acknowledgment in write-write-read scenarios. This is exactly your case, because writer.writeBytes(reqStr) sends a string byte-by-byte.

Now you have two options to fix the behavior:

  1. use socket.setTcpNoDelay(true) to disable Nagle's algorithm;
  2. send the complete request in one operation: writer.write(reqStr.getBytes());

In both cases the reused connection can then be expected to work faster.
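For illustration, a minimal sketch of both fixes, using the same fields and variables as the code in the question:

    // Option 1: disable Nagle's algorithm before connecting.
    socket = new Socket();
    socket.setTcpNoDelay(true);
    socket.connect(new InetSocketAddress("example.org", 80));

    // Option 2: write the request as a single byte array.
    // DataOutputStream.writeBytes() writes one byte per character, which is
    // what produces the write-write-read pattern described above.
    writer.write(reqStr.getBytes());
    writer.flush();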

apangin
  • Why wouldn't that apply to both connection types? – user207421 May 20 '16 at 00:46
  • @EJP Obviously, `socket.close()` forces all outstanding data to be sent immediately. This does not happen for a socket kept open. – apangin May 20 '16 at 00:48
  • That is neither obvious nor correct, and the request still has to be sent without closing the socket. All that `close()` does as far as data is concerned is queue up a FIN to be sent after the pending data. It does not turn off the [Nagle algorithm](https://tools.ietf.org/html/rfc896). – user207421 May 20 '16 at 01:43
  • @EJP If you carefully read the link you've posted, you'll realize that Nagle's algorithm delays transmission if previously sent data remains unacknowledged. This works poorly with [delayed ACK mechanism](https://en.wikipedia.org/wiki/TCP_delayed_acknowledgment). `close()` breaks this sequence: a new connection won't have unacknowledged data, and a new request will be sent almost immediately. – apangin May 20 '16 at 14:47
  • I've verified with `tcpdump` that client indeed delays sending the second request until the server's `ACK` arrives. This does not happen on a socket with `TCP_NODELAY` option. – apangin May 20 '16 at 14:47
  • Will you justify your delete vote (other than by personal antipathy)? – apangin May 20 '16 at 14:52
  • Thanks apangin, setting setTcpNoDelay to true (and disabling Nagle's algorithm) improved the performance of both of the tests I did. Now I get faster and equal performance for both tests (about 10 ms per request-response). I still am not sure why keep-alive wouldn't perform better than opening/closing sockets for each request, though (equal performance is already some progress compared to what I had). – fabien May 22 '16 at 02:41
  • I am familiar with the Nagle algorithm, thanks. If you read the link I posted, you will realize that the words 'close' and FIN don't occur anywhere in it, nor any words to the effect of 'forces all outstanding data to be sent immediately'. – user207421 May 22 '16 at 04:34
  • @fabien Your request-response measurements do not include `open` and `close`, which make the essential difference between two tests. Does the total execution time differ? I observe a big improvement in total time with a keep-alive connection. Please also try other hosts. – apangin May 22 '16 at 11:41
reader.read(buf);

Your test is invalid. You aren't necessarily reading the entire response. You need to change this to a loop that counts the data returned. And if you don't read the entire response, you will get out of sync in the keep-alive case.
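A sketch of such a counting loop, assuming buf and contentLength are set up as in the question (this mirrors the fix the question's code now includes):

    int offset = 0;
    while (offset < contentLength) {
        // read() may return fewer chars than requested, so keep reading until
        // the whole body has been consumed; otherwise the next response on a
        // reused socket would start mid-body.
        int ret = reader.read(buf, offset, contentLength - offset);
        if (ret == -1) {
            throw new java.io.EOFException("Stream ended before full response was read");
        }
        offset += ret;
    }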

user207421
  • You are not wrong. The payload tested here is so small (~1 KB), though, that it doesn't matter. I added the loop to the code above and updated the results. Same, as expected. – fabien May 17 '16 at 14:44