Why a connection pool of many opened connections is less costly for a system than to open a new connection every time?

Question

An answer from here

Typically, opening a database connection is an expensive operation, so pooling keeps the connections active so that, when a connection is later requested, one of the active ones is used in preference to opening another one.

I understand the concept of Connection Pool in DB management. That is the answer for a "what is ~" question. All developers blog posts, answers, tutorials, DB docs out there always answer for a question "what is". Like they constantly copy/paste text from one another. Nobody tries to explain "why is it so" and "how". The answer above is an example to it.

I can't understand why and how it is possible that keeping, say, 30 opened connections in a pool is less costly for a system than to open a new connection when it is required.

Suppose, I have a web server located in Australia. And a DB in AWS located in USA somewhere in Oregon. Or in GB. Or anywhere but very far from AUS. So all they say that keeping a pool of 20-... opened connections would be less costly for a memory and system performance than to open a new connection every time in such a case? How it can be? Why?

establishing a connection requires comms (think handshakes) between the endpoints, memory allocations etc. Also, if you are performing a loop to perform some DB interaction, you are going to be far better off re-using the same active connection than opening and closing within the loop; even if there is only a small difference for making a single connection, it will multiply up in a loop. — Mitch Wheat, May 20 '17 at 03:16
Green, for what it's worth, I've added extra information to the answer that prompted this question: see http://stackoverflow.com/questions/4041114/what-is-database-pooling/4041136#4041136. But Mitch and Maria are basically right, establishing a connection is not an instantaneous process. — paxdiablo, May 20 '17 at 04:38

score 9 · Answer 1 · answered May 20 '17 at 03:19

In your scenario, I think the biggest problem is network latency. You can't expect communication between servers located in two different continents to be particularly fast. So if you initiate a new connection every time you need one, you'd experience this latency every single time.

From here:

Connecting to a database server typically consists of several time-consuming steps. A physical channel such as a socket or a named pipe must be established, the initial handshake with the server must occur, the connection string information must be parsed, the connection must be authenticated by the server, checks must be run for enlisting in the current transaction, and so on.

Also, if the connection between the webserver and the database uses SSL/TLS, there's a handshake that must be performed on every new connection before the actual communication can occur (in addition to the normal handshake that ocurrs in normal connections). This handshake is expensive in terms of time.

From here:

Before the client and the server can begin exchanging application data over TLS, the encrypted tunnel must be negotiated: the client and the server must agree on the version of the TLS protocol, choose the ciphersuite, and verify certificates if necessary. Unfortunately, each of these steps requires new packet roundtrips between the client and the server, which adds startup latency to all TLS connections. (...) As the above exchange illustrates, new TLS connections require two roundtrips for a "full handshake"—that’s the bad news

When you use connection pooling, this overhead is avoided by regularly sending something like a 'ping' message to the SQL server from time to time, to avoid the connection from timeout due to inactivity. Sure, this might consume more memory in your server, but nowadays that's a far less expensive resource. Latency in networks is unavoidable.

Why a connection pool of many opened connections is less costly for a system than to open a new connection every time?

1 Answers1

Linked