13

I am using JDBC to talk to my Postgres database. If my entire app runs off a single connection, ie there is only ever one call to;

DriverManager.getConnection("jdbc:postgresql://host:5432/database", user, pass);

But this Connection object is shared across multiple threads in the Java, am I right in assuming that any attempt to use SQL transactions (BEGIN and COMMIT style) is only going to be very confusing and broken, given the potential for the Java threads to interleave? Does the Connection object 'know' which Java thread is using it to make queries?

Should I have one Connection object per Java thread and use the SQL transactions that way? Or should I perform all my transactional isolation in the Java using synchronized?

lynks
  • 5,599
  • 6
  • 23
  • 42

3 Answers3

15

Just to elaborate on the existing answers:

PgJDBC's Connection object is thread-safe, but only on a statement level. It won't crash or produce wrong result when used by multiple threads in autocommit mode but it won't isolate different threads' transactions for you. As per the documentation you need to use a connection pool for that.

There are actually lots of ways to use connections among multiple threads:

  • Use an internal connection pool where you fetch connections from, perform work with them, and return them to the pool. This is the strongly preferable option for most applications. Many JDBC connection pool implementations exist for Java, so don't roll your own. dbcp and c3p0 are two popular implementations, but if you're using a servlet environment or app server you should generally use the server's connection pool rather than bringing your own.

  • Use an external connection pool like pgbouncer or pgpool-II and open/close connections to it freely. This is slightly slower and is mostly an option used where the application cannot or for various reasons should not pool connections internally. You probably don't need to do this unless you need to limit total connection counts to the DB and share them between multiple applications or app instances.

  • Use no pool and open/close connections freely. This is terribly inefficient. Don't do it.

  • Keep a connection per thread using thread local storage. This'll work, but it's grossly inefficient because each open connection ties up database server resources while it sits idle. Don't do this unless you use an external connection pool like PgBouncer in transaction pooling mode, in which case it's OK.

  • Use only a single connection and wrap transactions in synchronized blocks, synchronizing on the Connection instance. This'll work and will use the database connection efficiently but will limit your threads' performance. It's generally not a good design for anything except toy/convenience apps.

  • Use only a single connection with its own dedicated thread. Have other connections pass data structures describing work to be done to that thread via a FIFO queue, producer/consumer style. This works if the threads spend most of their time doing CPU-heavy or other non-database work and need only limited database access. Pretty much the only reason to use it instead of using a connection pool is if you're constrained to using a single connection for some external reason, but if you are then it can be a decent option.

In general, though, you should just use a connection pool and be done with it.

Craig Ringer
  • 307,061
  • 76
  • 688
  • 778
  • Fantastically thorough thank you, you have confirmed much of what I believed but had never been told. – lynks Mar 27 '13 at 12:14
5

Am I right in assuming that any attempt to use SQL transactions (BEGIN and COMMIT style) is only going to be very confusing and broken, given the potential for the Java threads to interleave?

This is absolutely right.

Does the Connection object 'know' which Java thread is using it to make queries?

No, it does not.

Should I have one Connection object per Java thread and use the SQL transactions that way?

Yes, this is one way of doing it. The downside to the "connection per thread" allocation is a potential to open more connections than you need, leading to sub-optimal use of resources. You can also open a connection only when the thread needs it, and close it once thread is done with the RDBMS access. If you go this route, make sure that you use a connection pool to reduce the overhead of re-opening connections multiple times.

Sergey Kalinichenko
  • 714,442
  • 84
  • 1,110
  • 1,523
  • Thanks, I think I have it right in my head, I have an app that appears to mix java transaction isolation methods with sql methods, and sometimes a connection is spawned mid-thread, other times a global connection is used. I will move all of this to a connection pool :) – lynks Mar 26 '13 at 16:28
  • 1
    Minor nit-pick: Actually, the Java `Connection` object *does* know which thread is using it to make queries at any given time; `Thread.currentThread()`. It just doesn't care and it expects the threads to co-ordinate their work themselves. It would be entirely possible to write a Connection that used multiple underlying PostgreSQL connections to do thread-local sessions, but needlessly complex when you can do the same thing using individual Connection objects and existing thread-local mechanisms - or just use a connection pool. – Craig Ringer Mar 26 '13 at 23:43
2
Does the Connection object 'know' which Java thread is using it to make queries?

No, connection object does not know which java thread is using it.

Should I have one Connection object per Java thread and use the SQL transactions that way? Or should I perform all my transactional isolation in the Java using synchronized?

We should use Database specific jdbc connection pool datasource to do transaction, so that when a connection object is of no use, then the connection will go back to the pool without being garbage collected. In that way, the application server can optimize it's connection initialisation performance.

Also, you should use synchronised method call during Update Operation because that will make much safer operation in production.

Piyas De
  • 1,786
  • 15
  • 27
  • Hi, thanks for the answer. Could you clarify what you mean by the last bit? Is it not enough to wrap my query-set in `BEGIN` and `COMMIT` queries? What is the reason for adding a Java isolation mechanism on top? – lynks Mar 26 '13 at 16:39
  • Except Oracle Database, no other database does not do row-level locking while transaction by default. So it is safe to use synchronised method, because here all update will be processed in queue. Thanks – Piyas De Mar 26 '13 at 16:42