2

If we use any connection pooling framework or Tomcat JDBC pool then how much it is costly to open and close the DB connection?

Is it a good practice to frequently open and close the DB connection whenever DB operations are required?

Or same connection can be carried across different methods for DB operations?

catch23
  • 17,519
  • 42
  • 144
  • 217
Avyaan
  • 1,285
  • 6
  • 20
  • 47
  • If performance is significatn to you, I would consider switching to Hikari connection pool that is 2x-10x faster than Tomcat's default one. See https://www.baeldung.com/hikaricp – The Impaler Jun 30 '22 at 17:00

6 Answers6

3

Jdbc Connection goes through the network and usually works over TCP/IP and optionally with SSL. You can read this post to find out why it is expensive.

You can use a single connection across multiple methods for different db operations because for each DB operations you would need to create a Statement to execute.

Connection pooling avoids the overhead of creating Connections during a request and should be used whenever possible. Hikari is one of the fastest.

shazin
  • 21,379
  • 3
  • 54
  • 71
  • If DB and application are on the same machine. No network latency case – Avyaan Jul 01 '22 at 16:35
  • 2
    @Avyaan – If the RDBMS server and the application are running on the same iron, the network latency is reduced, but it is still there. This means that the cost for a new Database Connection are reduced in this case, but they are still existing. Only when you embed the database into the application, the network latency is gone, because you access the database now with direct Java method calls. – tquadrat Jul 10 '22 at 15:36
2

The answer is - its almost always recommended to re-use DB Connections. Thats the whole reason why Connection Pools exist. Not only for the performance, but also for the DB stability. For instance, if you don't limit the number of connections and mistakenly open 100s of DB connections, the DB might go down. Also lets say if DB connections don't get closed due to some reason (Out of Memory error / shut down / unhandled exception etc), you would have a bigger issue. Not only would this affect your application but it could also drag down other services using the common DB. Connection pool would contain such catastrophes.

What people don't realize that behind the simple ORM API there are often 100s of raw SQLs. Imagine running these sqls independent of connection pools - we are talking about a very large overhead.

I couldn't fathom running a commercial DB application without using Connection Pools.

Some good resources on this topic:
https://www.cockroachlabs.com/blog/what-is-connection-pooling/ https://stackoverflow.blog/2020/10/14/improve-database-performance-with-connection-pooling/

Tintin
  • 2,853
  • 6
  • 42
  • 74
0

Whether the maintenance (opening, closing, testing) of the database connections in a DBConnection Pool affects the working performance of the application depends on the implementation of the pool and to some extent on the underlying hardware.

A pool can be implemented to run in its own thread, or to initialise all connections during startup (of the container), or both. If the hardware provides enough cores, the working thread (the "business payload") will not be affected by the activities of the pool at all.

Other connection pools are implemented to create a new connection only on demand (a connection is requested, but currently there is none available in the pool) and within the thread of the caller. In this case, the creation of that connection reduces the performance of the working thread – this time! It should not happen too often, otherwise your application needs too many connections and/or does not return them fast enough.

But whether you really need a Database Connection Pool at all depends from the kind of your application!

If we talk about a typical server application that is intended to run forever and to serve a permanently changing crowd of multiple clients at the same time, it will definitely benefit from a connection pool.

If we talk about a tool type application that starts, performs a more or less linear task in a defined amount of time, and terminates when done, then using a connection pool for the database connection(s) may cause more overhead than it provides advantages. For such an application it might be better to keep the connection open for the whole runtime.

Taking the RDBMS view, both does not make a difference: in both cases the connections are seen as open.

tquadrat
  • 3,033
  • 1
  • 16
  • 29
0

If you have performance as a key parameter then better to switch to the Hikari connection pool. If you are using spring-boot then by default Hikari connection pool is used and you do not need to add any dependency. The beautiful thing about the Hikari connection pool is its entire lifecycle is managed and you do not have to do anything. Also, it is always recommended to close the connection and let it return to the connection pool so that other threads can use it, especially in multi-tenant environments. The best way to do this is using "try with resources" and that connection is always closed.

    try(Connection con = datasource.getConnection()){
         // your code here.

        } 

To create your data source you can pass the credentials and create your data source for example:

    DataSource dataSource =  DataSourceBuilder.create()
            .driverClassName(JDBC_DRIVER)
            .url(url)
            .username(username)
            .password(password)
            .build();

Link: https://github.com/brettwooldridge/HikariCP

Ritesh Singh
  • 181
  • 2
  • 12
0

If you want to know the answer in your case, just write two implementations (one with a pool, one without) and benchmark the difference.

Exactly how costly it is, depends on so many factors that it is hard to tell without measuring

But in general, a pool will be more efficient.

Rob Audenaerde
  • 19,195
  • 10
  • 76
  • 121
0

The costly is always a definition of impact.

Consider, you have following environment.

A web application with assuming a UI-transaction (user click) and causes a thread on the webserver. This thread is coupled to one connection/thread on the database

  • 10 connections per 60000ms / 1min or better to say 0.167 connections/s
  • 10 connections per 1000ms / 1sec => 10 connections/s
  • 10 connections per 100ms / 0.1sec => 100 connections/s
  • 10 connections per 10ms / 0.01sec => 1000 connections/s

I have worked in even bigger environments. And believe me the more you exceed the 100 conn/s by 10^x factors the more pain you will feel without having a clean connection pool. The more connections you generate in 1 second the higher latency you generate and the higher impact is it for the database. And the more bandwidth you will eat for recreating over and over a new "water pipeline" for dropping a few drops of water from one side to the other side.

Now getting back, if you have to access a existing connection from a connection pool it is a matter of micros or few ms to access the database connection. So considering one, it is no real impact at all. If you have a network in between, it will grow to probably x10¹ to x10² ms to create a new connection. Considering now the impact on your webserver, that each user blocks a thread, memory and network connection it will impact also your webserver load. Typically you run into webserver (e.g. revProxy apache + tomcat, or tomcat alone) thread pools issues on high load environments, if the connections get exhausted or they need too long time (10¹, 10² millis) to create

Now considering also the database.

If you have open connection, each connection is typically mapped to a thread on a DB. So the DB can use thread based caches to make prepared statements and to reuse pre-calculated access plan to make the accesses to data on database very fast. You may loose this option if you have to recreate the connection over and over again.

But as said, if you are in up to 10 connections per second you shall not face any bigger issue without a connection pool, except the first additional delay to access the DB. If you get into higher levels, you will have to manage the resources better and to avoid any useless IO-delay like recreating the connection.

Experience hints:

it does not cost you anything to use a connection pool. If you have issues with the connection pool, in all my previous performance tuning projects it was a matter of bad configuration.

You can configure

  • a connection check to check the connection (use a real SQL to access a real db field). so on every new access the connection gets checked and if defective it gets kicked from the connection pool
  • you can define a lifetime of a connections, so that you get new connection after a defined time

=> all this together ensure that even if your admins are doing crap and do not inform you (killing connection / threads on DB) the pool gets quickly rebuilt and the impact stays very low. Read the docs of the connection pool.

Is one connection pool better as the other?

A clear no, it is only getting a matter if you get into high end, or into distributed environments/clusters or into cloud based environments. If you have one connection pool already and it is still maintained, stick to it and become a pro on your connection pool settings.

cilap
  • 2,215
  • 1
  • 25
  • 51