0

Why queries like

delete from A where F1 IN (1,2,3,5,5) and F2 IN (7,9,10,11)

are so slow (F1 and F2 are indexed, stats updated) and how do you optimize them?

user1514042
  • 1,899
  • 7
  • 31
  • 57
  • 1
    Even if F1 and F2 are indexed, consider that having multiple conditions like that is effectively a small-scale cross-join. "ok I've got F1 = 1, is F2 any of .....". The table might be indexed, but your `IN` conditions aren't. – Marc B Aug 02 '13 at 13:59
  • Looks like a small but a painful problem with sql optimizer not being smart enough to handle it. I tried ANDs with a Cartesian product of my two ins and it worked slightly faster, while resulting in some gigantic sql – user1514042 Aug 02 '13 at 14:05

2 Answers2

1

Given your example, I'm not sure there's anything you could do to increase performance.

However, your example is simplistic, and if instead your example were using subqueries in the IN statements, then it would probably have room for improvement, perhaps by using an EXISTS instead or just joining. I think the meat of this question is probably about performance issues with IN statements though, right?

Your best tool when considering performance is to examine the explain plans of different solutions and see which one makes most sense for the amount and types of data you expect.

This SO post explains some about how an IN statement works...

SQL Server IN vs. EXISTS Performance

Here's a blog that also discusses performance factors...

http://sqlknowledgebank.blogspot.com/2012/11/in-exists-clause-and-their-performance.html

Community
  • 1
  • 1
Jim
  • 6,753
  • 12
  • 44
  • 72
0

By guess is a dual loop

My sample is a select
It is a lot faster to optimize a select first
With a join on a PK the query optimizer has more to work with
But with the PK you cannot insert 5 twice

create table #tempF1 (ID int primary key);
insert into #tempF1 values (1),(2),(3),(4);
create table #tempF2 (ID int primary key);
insert into #tempF2 values (1),(2),(3),(5);

select * 
from tbl
inner merge join #tempF1 
   on tbl.F1 = #tempF1.ID
inner merge join #tempF2 
   on tbl.F1 = #tempF2.ID

May not work in your situation and test other join hints and no hint

I use this technique on some big tables with complex queries where the query optimizer got stupid

paparazzo
  • 44,497
  • 23
  • 105
  • 176