3

Forgive me if this is a repeat and/or obvious question, but I can't find a satisfactory answer either on stackoverflow or elsewhere online.

Using Microsoft SQL Server, I have a nested select query that looks like this:

select * 
into FinalTable 
from 
    (select * from RawTable1 join RawTable2)
    join
    (select * from RawTable3 join RawTable4)

Instead of using nested selects, the query can be written using temporary tables, like this:

select * 
into Temp1 
from RawTable1 join RawTable2

select * 
into Temp2 
from RawTable3 join RawTable4

select * 
into FinalTable 
from Temp1 join Temp2 

Although equivalent, the second (non-nested) query runs several order of magnitude faster than the first (nested) query. This is true both on my development server and a client's server. Why?

Will D
  • 31
  • 1
  • 4
    your answer is here http://stackoverflow.com/questions/16767645/why-is-there-a-huge-performance-difference-between-temp-table-and-subselect – radar Oct 30 '14 at 19:38

1 Answers1

0

The database engine is holds subqueries in requisite memory at execution time, since they are virtual and not physical, the optimiser can't select the best route, or at least not until a sort in the plan. Also this means the optimiser will be doing multiple full table scans on each operation rather than a possible index seek on a temporary table.

Consider each subquery to be a juggling ball. The more subqueries you give the db engine, the more things it's juggling at one time. If you simplify this in batches of code with a temp table, the optimiser finds a clear route, in most cases regardless of indexes too, at least for more recent versions of SQL Server.

John Bell
  • 2,350
  • 1
  • 14
  • 23