
In PostgreSQL, if you encounter an error inside a transaction (for example, when an insert statement violates a unique constraint), the whole transaction is aborted: you cannot commit it, and no rows are inserted:

database=# begin;
BEGIN
database=# insert into mytable (id, something) values ('1','whatever');
INSERT 0 1
database=# insert into mytable (id, something) values ('1','whatever');
ERROR:  duplicate key value violates unique constraint "mytable_id_key"
DETAIL:  Key (id)=(1) already exists.
database=# insert into mytable (id, something) values ('2','whatever');
ERROR:  current transaction is aborted, commands ignored until end of transaction block
database=# rollback;
ROLLBACK
database=# select * from mytable;
 id | something
----+-----------
(0 rows)

You can change that by setting the psql variable ON_ERROR_ROLLBACK to "on" or "interactive". After that you can issue multiple inserts, ignoring any errors, then commit, and only the successfully inserted rows will be in the table when the transaction ends.

database=# \set ON_ERROR_ROLLBACK interactive
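For illustration, here is a sketch of the same kind of session with ON_ERROR_ROLLBACK set (assuming a table named mytable with a unique constraint on id; the names are placeholders). psql silently rolls back only the failed statement, so the commit keeps the two successful inserts:

database=# begin;
BEGIN
database=# insert into mytable (id, something) values ('1','whatever');
INSERT 0 1
database=# insert into mytable (id, something) values ('1','whatever');
ERROR:  duplicate key value violates unique constraint "mytable_id_key"
DETAIL:  Key (id)=(1) already exists.
database=# insert into mytable (id, something) values ('2','whatever');
INSERT 0 1
database=# commit;
COMMIT
database=# select * from mytable;
 id | something
----+-----------
  1 | whatever
  2 | whatever
(2 rows)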

In Oracle, this is the default transaction-management behaviour, which surprises me. Isn't it completely counterintuitive and dangerous?

When I start a transaction I want to be sure that all of its statements succeeded. What if my multiple inserts together make up some kind of object or data structure? I end up with no idea of the state of the data in my database and have to check it after the commit. If one of the inserts fails, I want to be sure that the other inserts will be rolled back, or not even evaluated after the first error, which is exactly how it's done in PostgreSQL.

Why does Oracle have this style of transaction management as the default, and why is it considered good practice?

For example, some random guy here in the comments writes:

This is a very neat feature.

I don't understand this, though: "Normally, any error you make will throw an exception and cause your current transaction to be marked as aborted. This is sane and expected behavior..."

No, it's really not. Oracle doesn't work this way, nor does MySQL. I have no experience with MSSQL or DB2 but I'll bet a dollar each they don't work this way either. There's no intuitive reason why a syntax error, or any other error for that matter, should abort a transaction. I can only assume there's either some limitation deep in the Postgres guts that requires this behavior, or that it conforms to some obscure part of the SQL standard that everyone else sensibly ignores. There's certainly no API / UX reason why it should work this way.

We really shouldn't be too proud of any workarounds we've developed for this pathological behavior. It's like IT Stockholm Syndrome.

Doesn't it violate the very definition of a transaction?

Transactions provide an "all-or-nothing" proposition, stating that each work-unit performed in a database must either complete in its entirety or have no effect whatsoever.

Cœur
Snifff

1 Answer


I agree with you. I think it's a mistake not to abort the whole tx. But people are used to that, so they think it's reasonable and correct. Like people who use MySQL think that the DBMS should accept 0000-00-00 as a date, or people using Oracle expect that '' IS NULL.

The idea that there's a clear distinction between a syntax error and something else is flawed.

If I write

BEGIN;

CREATE TABLE new_customers (...);

INSET INTO new_customers (...)
SELECT ... FROM customers;

DROP TABLE customers;

COMMIT;

I don't care that it's a typo resulting in a syntax error that caused me to lose my data. I care that the transaction didn't successfully execute all its statements but still committed.

It'd be technically feasible to allow soft rollback in PostgreSQL before any rows are actually written by a statement - probably before we even enter the executor. So failures in the parse and parameter binding phases could allow the tx not to be aborted. We have a statement memory context we could use to clean up.

However, once the statement starts changing rows, it's doing so on disk with the same transaction ID as the prior statements in the tx. So you can't roll it back without rolling back the whole tx. To allow statement rollback Pg needs to assign a new subtransaction ID. That costs resources.

You can do it explicitly with SAVEPOINTs when you want to, and internally that's what psql is doing. In theory we could allow the server to do this implicitly for each statement to implement statement rollback, just at a performance cost. But I doubt any patch implementing this would get committed, at least not without a LOT of argument, because most of the PostgreSQL team are (IMO reasonably) not fond of "whoops, that broke but we'll continue anyway" transaction semantics.
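The explicit SAVEPOINT form of that workaround looks like this, as a sketch (assuming a table named mytable with a unique constraint on id; the table and savepoint names are placeholders). Rolling back to the savepoint undoes only the failed statement, and the rest of the transaction can proceed:

database=# begin;
BEGIN
database=# insert into mytable (id, something) values ('1','whatever');
INSERT 0 1
database=# savepoint before_second_insert;
SAVEPOINT
database=# insert into mytable (id, something) values ('1','whatever');
ERROR:  duplicate key value violates unique constraint "mytable_id_key"
DETAIL:  Key (id)=(1) already exists.
database=# rollback to savepoint before_second_insert;
ROLLBACK
database=# insert into mytable (id, something) values ('2','whatever');
INSERT 0 1
database=# commit;
COMMIT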

Craig Ringer
  • good point! Maybe there are some arguments to do it one way or another. It's just I don't see any except "convenience" in Oracle way. – Snifff Aug 12 '16 at 09:02
  • _people using Oracle expect that NULL = NULL is true not NULL_ Are you sure? – Konstantin Sorokin Aug 12 '16 at 15:05
  • @KonstantinSorokin Nope, I was wrong. What I was thinking of is that in Oracle 9i, `'' IS NULL`. http://stackoverflow.com/q/13278773/398670 . I don't remember what DBMS it was where `NULL = NULL` was true ... Old MS SQL? Microsoft Access / JET engine? Old MYSQL? – Craig Ringer Aug 13 '16 at 01:15