2

I am using Postgres 9.3, Spring and Java.

For BatchUpdateException from javadoc:

After a command in a batch update fails to execute properly and a BatchUpdateException is thrown, the driver may or may not continue to process the remaining commands in the batch. If the driver continues processing after a failure, the array returned by the method BatchUpdateException.getUpdateCounts will have an element for every command in the batch rather than only elements for the commands that executed successfully before the error. In the case where the driver continues processing commands, the array element for any command that failed is Statement.EXECUTE_FAILED.

The I register following translator in jdbcTemplate

public class DuplicateRecordSQLErrorCodesTranslator extends SQLErrorCodeSQLExceptionTranslator {

    @Override
    protected DataAccessException doTranslate(String task, String sql, SQLException ex) {
        String errorCode = ex.getErrorCode() == 0 ? ex.getSQLState() : String.valueOf(ex.getErrorCode());

        if (ex instanceof BatchUpdateException) {
            int[] updatesCount = ((BatchUpdateException) ex).getUpdateCounts();

Then in scenario I am trying to insert a batch to DB of 100 entries, but 50 of them are duplicated. Eventually I am not able to retrieve all duplicated entries in the first transaction because updateCounts returned is always 1.

Djizeus
  • 4,161
  • 1
  • 24
  • 42
Nikolay Kuznetsov
  • 9,467
  • 12
  • 55
  • 101
  • So what's the problem? The javadoc says the batch might stop after the first insert if the second is in error (e.g. because of duplicate key). – piet.t Jul 16 '14 at 09:12
  • The problem is how to implement batch insert effectively. – Nikolay Kuznetsov Jul 16 '14 at 09:32
  • I can't see your actual SQL commands in the question. – Erwin Brandstetter Jul 24 '14 at 00:45
  • Think about generic batch of INSERT, each INSERT could fail because of some restriction, i.e. `INSERT INTO films VALUES('UA502', 'Bananas', 105, '1971-07-13', 'Comedy', '82 minutes');` – Nikolay Kuznetsov Jul 24 '14 at 19:16
  • 1
    copy them in to a temp table, then insert the ones that aren't duplicate by doing a join. like this: http://stackoverflow.com/questions/4069718/postgres-insert-if-does-not-exist-already – Greg Jul 27 '14 at 00:42
  • Post the code that fails, please? How you "translate" the error is not that relevant. Also, you mean that .updateCounts() returns an array of length 1, and not the number 1, right? – Xabster Jul 30 '14 at 19:56

2 Answers2

1

It's not clear exactly what you want. The PostgreSQL Documentation shows an example of how to write a PL/pgSQL function that does an update or insert depending on whether or not the primary key already exists.

CREATE TABLE db (a INT PRIMARY KEY, b TEXT);

CREATE FUNCTION merge_db(key INT, data TEXT) RETURNS VOID AS
$$
BEGIN
    LOOP
        -- first try to update the key
        UPDATE db SET b = data WHERE a = key;
        IF found THEN
            RETURN;
        END IF;
        -- not there, so try to insert the key
        -- if someone else inserts the same key concurrently,
        -- we could get a unique-key failure
        BEGIN
            INSERT INTO db(a,b) VALUES (key, data);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            -- Do nothing, and loop to try the UPDATE again.
        END;
    END LOOP;
END;
$$
LANGUAGE plpgsql;

SELECT merge_db(1, 'david');
SELECT merge_db(1, 'dennis');

So in your case you can use this function or modify it to suit your needs, then make batches of function calls. This is the most efficient way I can think of to do what you want in PostgreSQL. (Other databases have SQL extensions that handle this more directly.)

If instead you wanted the insert to fail on duplicate keys and just wanted to know which succeeded and which failed, you could rewrite the function to return a success or failure boolean or maybe null on success and the key ID on failure.

Old Pro
  • 24,624
  • 7
  • 58
  • 106
0

If you cannot take the approach suggested by OldPro, try this:

Create a List or a Set where POJO is having only the Key columns from the table. Now create a similar list for all the POJO you were going to update/insert in DB.

Loop over the set of POJO in the DB, then discard any POJO from your to_insert_or_update list which is already present in the DB.

This way you Batch update will go fairly smooth, unless there are other apps doing insert/update on the same set of records.

NRJ
  • 1,064
  • 3
  • 15
  • 32