Which is more efficent: update with a join, or Update with in statement?

Question

I have a temp table with a result set of IDs. I need to disable all the records from the main table that match the ID values within my result set.

Which of the following statements is more efficient?

UPDATE MyTable
  SET Disabled = 1
WHERE ID in (SELECT ID FROM @TempTable)

OR

UPDATE T
  SET T.Disabled = 1
FROM MyTable as T
JOIN @TempTable as Temp
ON T.ID = Temp.ID

Other minor question: Does Row count of the temp table effect this choice?

update ID is indexed on both MyTable, and @TempTable. I guess my question is "What factors should I think about when deciding between these two different techniques for the same outcome?"

It may be. This question is specifically about the UPDATE statement and how the different ways to filter the update statement affect the operation. — Nathan Tregillus, Mar 02 '12 at 22:34
Nathan, the usage of table variable (@TempTable) vs. temp table (#TempTable) can influence the plan because SQL Server will have stats on a temp table, but not a table variable. As the rowcount in the table variable gets smaller, the less effect I would expect to see on the overall performance. — Phil Helmer, Mar 03 '12 at 03:53

score 2 · Accepted Answer · answered Mar 02 '12 at 23:13

2

The modern versions of the optimizer should collapse these to the exact same plan, making their performance equivalent enough to be deemed identical. (You forgot to tell us which version.) But there is no way I would ever give you a blanket answer that this will always be true. There are just too many variables that make it ok to make assumptions about this without checking the plans in each scenario. IMHO.

answered Mar 02 '12 at 23:13

Aaron Bertrand

272,866
37
466
490

Thanks for braving the waters with an answer. I have been searching the web, and I am pretty much reading that "the engine SHOULD always build the same plan" I guess it all comes down to the execution plan per scenario. – Nathan Tregillus Mar 02 '12 at 23:21

cairnz · Answer 2 · 2012-03-03T16:41:57.413

0

I played with this a little bit on my laptop, and both IN, JOIN and EXISTS proved to give identical plans (in my example, trying to emulate your case).

However there was a very minute difference in compile time (and compile memory) for these plans. JOIN was fastest, followed by IN, then by EXISTS (1, 4 and 7 ms respectively, repeatedly tested, retested and in different orders).

That means if you are recompiling this query every time, you could possibly save some ms per execution by writing it as a JOIN query instead of EXISTS.

edited Mar 03 '12 at 16:41

answered Mar 03 '12 at 00:04

cairnz

3,917
1
18
21

But keep in mind the compile time can be influenced by many factors. Your test may not hold true for different data, different indexes, different stats, different number of procs, number of rows, etc etc ad nauseum... still not appropriate to suggest this savings will always exist outside of your controls test. You're more than welcome to believe it but I would caution others that the test is not conclusive for all scenarios. – Aaron Bertrand Mar 03 '12 at 00:28
I made sure that "this query" is now emphasised. It already was it in the post, but maybe it didn't come out very clear. It was anyways not meant as a proper answer but some side information :) – cairnz Mar 03 '12 at 09:44
But even the behavior of *this query* could change tomorrow or next week, since the optimizer doesn't base all of its decisions on the query text alone, and the system is not always in the exact same state. – Aaron Bertrand Mar 03 '12 at 14:11

Which is more efficent: update with a join, or Update with in statement?

2 Answers2