2

In order to connect to my Oracle DB, I use the thin Oracle JDBC driver and the tomcat JDBC connection pool.

The default setting tomcat-pool is not to touch the connection when it is returned to the pool (any associated transaction is not committed or rolled back, and the actual connection is not closed).

The default transaction isolation level of the Oracle driver is READ_COMMITTED, and auto-commit is set to false

I have several business methods that use the same connection pool to connect to the database. In some business methods, I explicitly create a new transaction, issue some updates and commit that transaction. So far so good.

Some other business methods are read-only. They are read-only because I only issue SELECT queries, not because I've explicitly set READ ONLY mode on the connection (it stays off). My understanding is that under these conditions, Oracle creates a new implicit transaction the first time is encounters a SELECT statement, and keeps that transaction open until I explicitly COMMIT it.

However, given that I only issue SELECT queries, I don't feel that it is necessary to commit the transaction.

I understand that not committing those read-only transactions before returning their connections to the pool means that then next time the connection is borrowed from the pool, the next business method will execute within a transaction that was already started earlier.

My question are:

  • Does it matter?
  • Given the transaction isolation is set to READ_COMMITTED and auto-commit is false, is there any potential extra data race condition that wouldn't have existed if I committed my read-only transaction?
  • Does not committing the transaction incur performance costs on the DB server side (the transaction could potentially stay uncommitted for hours)?
  • Do the answers to the above questions change if we are connecting to RAC instead of a single Oracle DB instance?
  • What about if I'm selecting from a DB Link?

From all the resources I've read, my conclusion so far is that what I'm doing is technically not correct, but that as long as I don't pick a higher transaction isolation level it also doesn't matter. READ_COMMITTED will read whatever was most recently committed by other transactions regardless of whether or not the current transaction is committed or not, and the overhead on the oracle server for keeping track of a transaction that has no pending modifications is trivial. So fixing this issue falls into the "should be fixed, but not an emergency" category.

Does that make sense or did I misunderstand something important?

LordOfThePigs
  • 11,050
  • 7
  • 45
  • 69
  • I've looked into the code of a bunch of them, and none of them do it by default. Some have extra settings that allow you to configure an auto-commit or auto-rollback. – LordOfThePigs Feb 18 '21 at 15:24
  • This looks related: https://stackoverflow.com/q/25886466/217324 – Nathan Hughes Feb 18 '21 at 15:30
  • Thanks for the link. It is related in that it gives clues on how to set up the pool to do the commit on its own. However, it doesn't answer the questions I asked. I think the answers to those questions are interesting, and I couldn't find any other resource that answered them definitevely. Also, tomcat-pool can be configured to commit transactions on return, but it will unconditionally commit. Which means that in the case where I have already commited the TX, i will incur an extra, un-necessary DB roundtrip to commit an empty transaction. – LordOfThePigs Feb 18 '21 at 15:35
  • Yes it would be good to have a definitive answer, I upvoted. I would be worried about filling up the transaction log. – Nathan Hughes Feb 18 '21 at 15:47
  • 1
    Why do you think a transaction is created for a `select`? Is the driver doing its own `set transaction` call or something? Otherwise just querying data shouldn't start a transaction. – Alex Poole Feb 18 '21 at 15:52
  • I would be worried about lax transaction handling and the ensuing quality and maintenance issues. I don't see why you should try to micro-optimize by not committing read transactions, when databases are designed to handle thousands of transactions per second. The effects may very well be pool and configuration dependent, so better stick with the standard idioms, right? Then you won't be surprised when a `SELECT` query gets modified into something that takes locks, and causes problems. – Kayaman Feb 18 '21 at 15:52
  • @Kayaman I also worry about lax transaction handling, that's why I'm asking the question. The issue here is I've just written some automated runtime checks to discover wrong DB usage patterns, and something like 80% of the 500 applications in my organization reported this particular issue. I understand this is wrong, but what I really need to know now, is how critical this is. So far, I'm leaning towards: it's ugly but harmless. – LordOfThePigs Feb 18 '21 at 15:57
  • @AlexPoole from what I understand of the behavior of Oracle, nothing an happen outside of a transaction. In the Oracle dialect, any DML statement starts a new transaction if one doesn't exist. That includes `SELECT` queries. – LordOfThePigs Feb 18 '21 at 15:59
  • According to this: https://docs.oracle.com/database/121/ZZPRE/ch_seven.htm#ZZPRE737 transactions are terminated only on `COMMIT`, `ROLLBACK` or schema modifications. So to me, this implies that a read TX is only closed if it either committed or rolled back. – LordOfThePigs Feb 18 '21 at 16:05
  • 1
    Yes, an explicitly started transaction would not be implicitly closed. I doubt you'll run into critical problems, but it's still an annoying uncertainty vector. I would be more worried about future programmer mistakes than the DB not being able to handle uncommitted connections. Perhaps you should consider something to make the tx handling a bit easier on the developers? Manual transaction handling is cumbersome and prone to errors. – Kayaman Feb 18 '21 at 16:15
  • The wording in the [docs](https://docs.oracle.com/en/database/oracle/oracle-database/19/cncpt/transactions.html#GUID-E3FB3DC3-3317-4589-BADD-D89A3547F87D) suggests *any* DML, but earlier refers to "constitute an atomic change", which a query doesn't. If you run the query shown in that section without doing an update it doesn't return anything, indicating there is no transaction from the query. Unless you explicitly start one, of course. Also, [see this](https://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:4463542200346232696). – Alex Poole Feb 18 '21 at 16:15
  • @AlexPoole the link you sent seems to confirm my suspicion, except it also says "But still, we have to commit or rollback if we select from a table using a database link". I know for sure that there are cases where we are reading from a database link (in Oracle that's just a synonym in the local schema that points to an object in a different database instance) – LordOfThePigs Feb 18 '21 at 16:26
  • @LordOfThePigs - yes, indeed you do; but your question didn't suggest you had distributed transactions. Those should be ended. For local queries there isn't a transaction to end, unless you or the driver are creating one. None of this is helping you with the definitive answer you're looking for of course... – Alex Poole Feb 18 '21 at 16:30
  • @AlexPoole it *is* the driver starting the transaction. Although it might be possible (at least in theory) to write a driver that would not start unnecessary transactions, it would require that the DB wire protocol tell the driver whether a TX was started after performing a query. I don't know if the Oracle protocol would allow that or if the Oracle driver were written to take advantage of it (and based on the behaviour seen it doesn't seem to be). – Kayaman Feb 22 '21 at 11:47

1 Answers1

3

Like Alex Poole commented, it's true that the database engine does not necessarily need to create a transaction for certain cases of read operations for correct operation. I'm not familiar with Oracle internals, but I'll trust Alex that running this query after a select query will show that no actual TX was created.

However, the database engine is the lowest level and knows everything. On top of that we have DB client programs, one of which is the JDBC driver. The JDBC has its own standard, which includes things like setAutocommit(), a thing that databases don't know about, and which is used for transaction handling in JDBC code. Note that setAutocommit(false) is an abstraction, and it does not mean "start transaction", it just means that operations will be grouped in a transaction in some way.

This is where the JDBC vs. DB engine differences start. I'm basing this on PostgreSQL driver code (as it is more readily available), but I expect the Oracle driver to behave in similar way.

Since the driver cannot easily know whether a TX is needed or not, it cannot do heavy optimization like the actual DB engine. The PostgreSQL driver has a SUPPRESS_BEGIN flag that can be used to indicate that a transaction isn't needed even if one would be started otherwise, but based on a quick look it's used only when fetching metadata, an operation known to be safe from transaction issues. Otherwise the driver will issue a BEGIN to start a transaction even for a SELECT, when autocommit=false.

So even if the DB engine doesn't need a TX, the driver is an abstraction on top of the DB, working with JDBC specs. Since you've observed open transactions, it's safe to assume that the Oracle driver also creates explicit transactions for operations where the Oracle DB engine would not create implicit transactions.

So, does it matter?

As discussed in the comments, probably won't make a difference when you have some simple selects to local tables as the transactions aren't holding up significant resources. But throw in DB Links, changes to queries over time, and there's a non-zero amount of risk.

Kayaman
  • 72,141
  • 5
  • 83
  • 121