SQL - improve NOT EXISTS query performance

Question

Is there a way I can improve this kind of SQL query performance:

INSERT
INTO ...
WHERE NOT EXISTS(Validation...)

The problem is when I have many data in my table (like million of rows), the execution of the WHERE NOT EXISTS clause if very slow. I have to do this verification because I can't insert duplicated data.

I use SQLServer 2005

thx

Yes but they are not formatted the same way. For example in table A the date is an integer 20070102 and in table B the table is a datetime — Melursus, Feb 16 '09 at 21:34
usually if you can do NOT EXISTS then there is likely a way to swap to using EXISTS ... may need an extra temp table but it's possible to be quicker — whytheq, Aug 15 '19 at 13:06

score 14 · Accepted Answer · answered Feb 16 '09 at 20:50

14

Make sure you are searching on indexed columns, with no manipulation of the data within those columns (like substring etc.)

answered Feb 16 '09 at 20:50

cjk

45,739
9
81
112

score 13 · Answer 2 · answered Feb 16 '09 at 20:58

13

Off the top of my head, you could try something like:

 TRUNCATE temptable
 INSERT INTO temptable ...
 INSERT INTO temptable ... 
 ...
 INSERT INTO realtable
 SELECT temptable.* FROM temptable
 LEFT JOIN realtable on realtable.key = temptable.key
 WHERE realtable.key is null

answered Feb 16 '09 at 20:58

Blorgbeard

101,031
48
228
272

2

In my tests in a table with millions of rows this has been the fastest solution by far... OP's option lasts several minutes, accepted (ck's) answer lasts several minutes, your solution lasts 10 seconds (and I'm using the correct indexes in all solutions) – David Espart Aug 02 '11 at 16:54

score 6 · Answer 3 · answered Feb 16 '09 at 20:56

6

Try to replace the NOT EXISTS with a left outer join, it sometimes performs better in large data sets.

answered Feb 16 '09 at 20:56

Otávio Décio

73,752
17
161
228

Funny, I've more often found the opposite. EXISTS will stop searching at the first match found, whereas a join makes all possible matches. Thus EXISTS ought to be faster. I think. – sfuqua Feb 16 '09 at 23:20
The thing is, NOT EXISTS will always cause a table scan whereas if you are careful with your join you might be working solely with indexes. – Otávio Décio Feb 16 '09 at 23:31
2

ávio, I don't think it's true that NOT EXISTS will always cause a table scan. See this post: http://social.msdn.microsoft.com/Forums/en-US/transactsql/thread/582544fb-beda-46c0-befd-4b28b5c2cdee/ – John M Gant Nov 11 '10 at 14:26

score 2 · Answer 4 · answered Sep 30 '19 at 20:29

Outer Apply tends to work for me...

instead of:

from t1
where not exists (select 1 from t2 where t1.something=t2.something)

I'll use:

from t1
outer apply (
    select top 1 1 as found from t2 where t1.something=t2.something
) t2f
where t2f.found is null

score 1 · Answer 5 · answered Feb 16 '09 at 20:56

Pay attention to the other answer regarding indexing. NOT EXISTS is typically quite fast if you have good indexes.

But I have had performance issues with statements like you describe. One method I've used to get around that is to use a temp table for the candidate values, perform a DELETE FROM ... WHERE EXISTS (...), and then blindly INSERT the remainder. Inside a transaction, of course, to avoid race conditions. Splitting up the queries sometimes allows the optimizer to do its job without getting confused.

score 0 · Answer 6 · answered Feb 16 '09 at 20:57

If you can at all reduce your problem space, then you'll gain heaps of performance. Are you absolutely sure that every one of those rows in that table needs to be checked?

The other thing you might want to try is a DELETE InsertTable FROM InsertTable INNER JOIN ExistingTable ON <Validation criteria> before your insert. However, your mileage may vary

score 0 · Answer 7 · edited Sep 15 '17 at 19:00

0

insert into customers 
select * 
from newcustomers 
where customerid not in (select customerid 
                         from customers)

..may be more efficient. As others have said, make sure you've got indexes on any lookup fields.

edited Sep 15 '17 at 19:00

C B

1,677
6
18
20

answered Feb 16 '09 at 21:09

SqlACID

4,024
20
28

SQL - improve NOT EXISTS query performance

7 Answers7

Linked