Increasing INSERT speed

Question

I'm currently using the following query to insert into a table only if the record does not already exist, presumably this leads to a table scan. It inserts 28000 records in 10 minutes:

INSERT INTO tblExample(column)
(SELECT ? FROM tblExample WHERE column=? HAVING COUNT(*)=0)

If I change the query to the following, I can insert 98000 records in 10 minutes:

INSERT INTO tblExample(column) VALUES (?)

But it will not be checking whether the record already exists.

Could anyone suggest another way of querying such that my insert speed is faster?

Do you have a unique index on the column? That would both speed up the lookup query and enforce the constraint (in case you missed it in your code). — Thilo, Apr 21 '16 at 09:32
Thanks @Thilo. I have taken the primary key off of the table because I have read that having a PK increases the time taken to insert. Are you suggesting that I use a PK and catch the error if a duplicate is found? — jdie8274j, Apr 21 '16 at 09:36
Having the PK does slow down the insert, but if you are not sure that the data is unique, you need that constraint validation. So yes, put it back on. — Thilo, Apr 21 '16 at 09:39
You could try the `derby` equivalent of an `UPSERT`. [Can I use MERGE INTO to simulate “upsert” in Apache Derby?](http://stackoverflow.com/q/34734653/823393) — OldCurmudgeon, Apr 21 '16 at 09:40
Hi @Thilo. I have tried this, but when I do a batch insert, it then throws a BatchUpdateException because it violates the unique or primary key constraint. I can catch the error, but it will not process any more of the records at that point. — jdie8274j, Apr 21 '16 at 12:47

justAbit · Accepted Answer · 2016-04-21T09:33:04.493

1

One simple solution (but not recommended) could be to simply have insert statement, catch duplicate key exception and log them. Assuming that the table has unique key constraint.

edited Apr 21 '16 at 09:33

answered Apr 21 '16 at 09:31

justAbit

4,226
2
19
34

This is what I have gone with and performance has really improved. Thanks! – jdie8274j Apr 21 '16 at 15:46

Bob Salmon · Answer 2 · 2016-04-21T09:45:30.197

Make sure that you have an index on the column[s] you're checking. In general, have a look at the query execution plan that the database is using - this should tell you where the time is going, and so what to do about it.

For Derby db this is how you get a plan and how to read it.

Derby also has a merge command, which can act as insert-if-not-there. I've not used it myself, so you'd need to test it to see if it's faster for your circumstances.

Increasing INSERT speed

2 Answers2