58

Keep-alives were added to HTTP to basically reduce the significant overhead of rapidly creating and closing socket connections for each new request. The following is a summary of how it works within HTTP 1.0 and 1.1:

HTTP 1.0 The HTTP 1.0 specification does not really delve into how Keep-Alive should work. Basically, browsers that support Keep-Alive appended an additional header to the request as [edited for clarity] explained below:

When the server processes the request and generates a response, it also adds a header to the response:

Connection: Keep-Alive

When this is done, the socket connection is not closed as before, but kept open after sending the response. When the client sends another request, it reuses the same connection. The connection will continue to be reused until either the client or the server decides that the conversation is over, and one of them drops the connection.

The above explanation comes from here. But I don't understand one thing

When this is done, the socket connection is not closed as before, but kept open after sending the response.

As I understand it, we just send TCP packets to make requests and responses. How does this socket connection help, and how does it work? We still have to send packets, so how can it establish a persistent connection? It seems so unreal.

Razzle
good_evening
  • 1
    @JakeGould: Thanks for the edit. I think it's appropriate; similar questions have been asked before. For example: http://stackoverflow.com/questions/1480329/what-exactly-does-a-persistent-connection-mean, but they don't explain how this `socket connection` actually works. – good_evening Dec 24 '13 at 16:38
  • @good_evening That's an old question; back then such questions were okay. As JakeGould suggested, you can try Server Fault or [networkengineering.se]. – Bleeding Fingers Dec 24 '13 at 16:45
  • 6
    I think this is more appropriate here than on server fault since it just asks for explanations of the technology which can help programmers. – Igor Čordaš Mar 18 '14 at 11:07
  • 2
    @good_evening Sounds pretty self explanatory to me. The HTTP model works like a phone call to your friend. You call him and ask question, he answers and then you hang up. You repeat this for every question. Keep alive means you don't hang up the phone and either of you just talk whenever you have something to talk about. I suppose another example would be that you log into a website and it saves your credentials so you don't have to keep logging in every time you go to it. – The Muffin Man May 25 '15 at 08:19

5 Answers

91

There is overhead in establishing a new TCP connection (DNS lookups, TCP handshake, SSL/TLS handshake, etc). Without a keep-alive, every HTTP request has to establish a new TCP connection, and then close the connection once the response has been sent/received. A keep-alive allows an existing TCP connection to be re-used for multiple requests/responses, thus avoiding all of that overhead. That is what makes the connection "persistent".

In HTTP 0.9 and 1.0, by default the server closes its end of a TCP connection after sending a response to a client. The client must close its end of the TCP connection after receiving the response. In HTTP 1.0 (but not in 0.9), a client can explicitly ask the server not to close its end of the connection by including a Connection: keep-alive header in the request. If the server agrees, it includes a Connection: keep-alive header in the response, and does not close its end of the connection. The client may then re-use the same TCP connection to send its next request.

In HTTP 1.1, keep-alive is the default behavior, unless the client explicitly asks the server to close the connection by including a Connection: close header in its request, or the server decides to include a Connection: close header in its response.
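The rules above can be sketched as a small decision function. This is only an illustration of the logic described in this answer, not code from any real server; the name `connection_persists` and its arguments are made up for the example.

```python
def connection_persists(http_version, request_headers, response_headers):
    """Return True if the TCP connection stays open after this exchange.

    Hypothetical helper: http_version is "0.9", "1.0" or "1.1", and the
    header arguments are plain dicts of header name -> value.
    """
    def connection_tokens(headers):
        # The Connection header is a comma-separated list of tokens,
        # and both the header name and the tokens are case-insensitive.
        value = ""
        for name, header_value in headers.items():
            if name.lower() == "connection":
                value = header_value
        return {tok.strip().lower() for tok in value.split(",") if tok.strip()}

    req = connection_tokens(request_headers)
    resp = connection_tokens(response_headers)

    if "close" in req or "close" in resp:
        return False              # either side asked to close
    if http_version == "1.1":
        return True               # persistent by default
    if http_version == "1.0":
        # The client must ask, and the server must agree.
        return "keep-alive" in req and "keep-alive" in resp
    return False                  # HTTP 0.9: always closed after the response
```

For example, `connection_persists("1.0", {"Connection": "keep-alive"}, {})` is false because the server never agreed, while `connection_persists("1.1", {}, {})` is true.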

BornToCode
Remy Lebeau
  • 2
    Thanks. That's what I didn't quite understand, why TCP connection is difficult to create. What else besides `DNS lookups, TCP handshake, SSL/TLS handshake` needs to be done? – good_evening Dec 24 '13 at 17:01
  • 16
    Nothing. But by themselves, DNS, TCP, and SSL are not lightweight systems, each one of them takes time and resources to perform their respective steps before the next one can be performed. DNS has to resolve a hostname to an IP address before a TCP connection can be made. TCP has to perform a three-way handshake to make a new connection before an SSL session can be made. SSL involves multiple handshakes to exchange crypto info back and forth. The fewer times you have to perform those steps between a given client/server pair, the faster HTTP requests can be sent and responded to. – Remy Lebeau Dec 24 '13 at 17:30
  • What I get from the answer is if the flags are present in the headers, the connection stays alive. I don't get who ultimately decides when the connection closes. I don't get how the server sends a response to the client and signals the end of a response without closing the connection and forcing an EOF which would normally signal the end of a response in a connection:close situation. All I get from this answer is that HTTP 1.1 defaults to tcp keep alive and previous versions need it specified. I get no answers on how this applies to HTTP keep-alive. –  Apr 27 '15 at 17:31
  • @RemyLebeau I'm not trying to be difficult. I came to this answer while writing a proxy because even though I transparently hand data back and forth between client and server, even when I reach the end of a response and send it back to the client, the client just sits there holding the connection open and not asking for anything else. All of the data is successfully transferred. Since it's tcp-keep alive, I never get nor send an EOF. If I force close, the client reacts finally because of EOF. So I must be missing something and this answer offers no insight into the required mechanics. –  Apr 27 '15 at 17:34
  • 2
    @TechnikEmpire: please read the HTTP spec, particularly [RFC 2616 section 4.4](http://tools.ietf.org/html/rfc2616#section-4.4). There are several ways to signal the end of an HTTP message without closing the connection. The client must analyze the server's response headers to find out which mechanism is actually being used. And if you are writing your own proxy, you have to be careful that you are not messing them up. You cannot blindly send everything as-is, you might have to analyze/adjust the HTTP messages as keep-alives are handled on a per-connection basis and a proxy has 2 connections. – Remy Lebeau Jul 09 '15 at 22:55
  • @RemyLebeau thanks for the links, I am aware of the things you outlined in the RFC. I think perhaps the issues I'm facing stem from improperly handling keep-alive behavior pipelined transactions. I'm going to do a rewrite of my parsing methods tonight and I believe this will solve the issue. –  Jul 10 '15 at 01:27
27

Let's make an analogy. HTTP consists of sending a request and getting the response. This is similar to asking someone a question and receiving an answer.

The problem is that the question and the answer need to go through the network. To communicate through the network, TCP (sockets) is used. That's similar to using the phone to ask a question to someone and having this person answer.

With HTTP 1.0, loading a page that contains 2 images, for example, consists of the following steps:

  • make a phone call
  • ask for the page
  • get the page
  • end the phone call
  • make a phone call
  • ask for the first image
  • get the first image
  • end the phone call
  • make a phone call
  • ask for the second image
  • get the second image
  • end the phone call

Making a phone call and ending it takes time and resources. Control data (like the phone number) must transit over the network. It would be more efficient to make a single phone call to get the page and the two images. That's what keep-alive allows. With keep-alive, the above becomes:

  • make a phone call
  • ask for the page
  • get the page
  • ask for the first image
  • get the first image
  • ask for the second image
  • get the second image
  • end the phone call
JB Nizet
  • 1
    Thanks, yes, I understand this. But how can we actually make it work only with `a single phone call`? What happens between a client and a server in the first stage (`make a phone call`)? – good_evening Dec 24 '13 at 17:06
  • First stage: http://en.wikipedia.org/wiki/Transmission_Control_Protocol#Connection_establishment. What happens after is dictated by the HTTP protocol. The server expects an HTTP request and answers with an HTTP response. Then it expects another request, etc. – JB Nizet Dec 24 '13 at 17:55
21

This is indeed a networking question, but it may be appropriate here after all.

The confusion arises from the distinction between packet-oriented and stream-oriented connections.

The Internet is often called a "TCP/IP" network. At the low level (IP, Internet Protocol) the Internet is packet-oriented. Hosts send packets to other hosts.

However, on top of IP we have TCP (Transmission Control Protocol). The entire purpose of this layer of the internet is to hide the packet-oriented nature of the underlying medium and to present the connection between two hosts (hosts and ports, to be more correct) as a stream of data, similar to a file or a pipe. We can then open a socket in the OS API to represent that connection, and we can treat that socket as a file descriptor (literally an FD in Unix, very similar to file HANDLE in Windows).

Most other Internet client-server protocols (HTTP, Telnet, SSH, SMTP) are layered on top of TCP. Thus a client opens a connection (a socket), writes its request to the socket (where it is transmitted as one or more packets in the underlying IP), reads the response from the socket (and the response can contain data from multiple IP packets as well) and then... Then the choice is to keep the connection open for the next request or to close it. Pre-KeepAlive HTTP always closed the connection. New clients and servers can keep it open.
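A small sketch of that stream view, using plain TCP sockets on the loopback interface: the client writes several messages to one socket, and the receiver just sees a stream of bytes until the client closes. The function name `send_over_one_connection` is made up for this illustration, and the messages are arbitrary bytes rather than real HTTP.

```python
import socket
import threading


def send_over_one_connection(messages):
    """Send every message over a single TCP connection; return what arrived."""
    listener = socket.create_server(("127.0.0.1", 0))  # port 0: any free port
    port = listener.getsockname()[1]
    received = bytearray()

    def serve():
        conn, _ = listener.accept()  # the connection is established once
        with conn:
            # recv() returns b"" only when the peer has closed its end.
            while chunk := conn.recv(4096):
                received.extend(chunk)

    server_thread = threading.Thread(target=serve)
    server_thread.start()

    with socket.create_connection(("127.0.0.1", port)) as client:
        for message in messages:
            # How these bytes are split into IP packets is TCP's business;
            # the application only writes to the stream.
            client.sendall(message)

    server_thread.join()
    listener.close()
    return bytes(received)
```

Whatever packets carried them, the bytes come out in order as one stream: sending `b"GET /a "` and then `b"GET /b"` yields `b"GET /a GET /b"` on the other side. Nothing in the stream marks where one message ends, which is why the protocol on top has to say so itself.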

The advantage of KeepAlive is that establishing a connection is expensive. For short requests and responses it may take more packets than the actual data exchange.

The slight disadvantage may be that the server now has to tell the client where the response ends. The server cannot simply send the response and close the connection. It has to tell the client: "read 20KB and that will be the end of my response". Thus the size of the response has to be known in advance by the server and communicated to the client as part of the higher-level protocol (e.g. Content-Length: in HTTP). Alternatively, the server may send a delimiter to specify the end of the response; it all depends on the protocol above TCP.
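As a rough illustration of the Content-Length case only (chunked and other framings are not handled here), the reader below pulls exactly one response off a byte stream and leaves the rest for the next call. `read_response` is a hypothetical helper, not part of any library. On a real socket you could pass it `sock.makefile("rb")`; here any file-like byte stream works.

```python
import io


def read_response(stream):
    """Read one HTTP response from a byte stream, using Content-Length
    to find where the body ends. Returns (status_line, body)."""
    status_line = stream.readline().rstrip(b"\r\n")

    headers = {}
    while True:
        line = stream.readline().rstrip(b"\r\n")
        if not line:
            break  # an empty line ends the header section
        name, _, value = line.partition(b":")
        headers[name.strip().lower()] = value.strip()

    # Assumption for this sketch: the server always sends Content-Length.
    length = int(headers[b"content-length"])
    body = stream.read(length)  # read exactly this many bytes, no more
    return status_line, body


# Two responses back to back on the same "connection".
wire = io.BytesIO(
    b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello"
    b"HTTP/1.1 404 Not Found\r\nContent-Length: 4\r\n\r\ngone"
)
first = read_response(wire)
second = read_response(wire)
```

Because the first call stops after 5 body bytes, the second call starts cleanly at the next status line without the connection ever being closed.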

  • 3
    HTTP has multiple ways to terminate a response, depending on the format of the response (plain vs chunked vs MIME). A `Content-Length` is not always used/possible. – Remy Lebeau Dec 24 '13 at 17:33
  • 2
    I really like this answer it is not a best fit for the question but what you wrote in your answer is just what I was looking for. And to make this not just a thank you comment yes Content-Length is not always applicable but there are other ways to tell the client where the response ends and that is what is important since you lose "Read all then end" possibility by keeping connection open. – Igor Čordaš Mar 18 '14 at 11:13
2

You can understand it this way:

HTTP uses TCP as its transport. A plain exchange over TCP goes through these steps:

  1. The client sends the connect request
  2. The server responds
  3. The data transfer is done
  4. The connection is closed

However, if we are using the keep-alive feature, the connection is not closed after receiving the data. The connection stays active.

This helps improve performance: for the next calls, connection establishment does not take place because the connection to the server is already there. This means less time taken. Although the time taken to connect is small, it makes a lot of difference in systems where every ms counts.

Giacomo1968
Hardeep Singh
0

Keep-alive exemplifies The Law of Leaky Abstractions. While HTTP is intentionally designed as a stateless protocol, it is built upon TCP, which is inherently stateful. As a result, we must make certain compromises to prevent performance drawbacks.

ibrahim koz