Why would a sub query perform better than a litteral value in a WHERE clause with multiple joins?

Question

Take to following query:

SELECT *
FROM FactALSAppSnapshot AS LB 
     LEFT OUTER JOIN MLALSInfoStage AS LA ON LB.ProcessDate = LA.ProcessDate AND 
                                                   LB.ALSAppID = LA.ALSNumber
     LEFT OUTER JOIN MLMonthlyIncomeStage AS LC ON LB.ProcessDate = LC.ProcessDate AND
                                                        LB.ALSAppID = LC.ALSNumber
     LEFT OUTER JOIN DimBranchCategory AS LI on LB.ALSAppBranchKey = LI.Branch
WHERE LB.ProcessDate=(SELECT TOP 1 LatestProcessDateKey
                      FROM DimDate)

Notice that the WHERE condition is a scalar sub query. The runtime for this is 0:54 resulting in 367,853 records.

However, if I switch the WHERE clause to the following:

WHERE LB.ProcessDate=20161116

This somehow causes the query runtime to jump up to 57:33 still resulting in 367,853 records. What is happening behind the scenes that would cause this huge jump in the runtime? I would have expected the sub query version to take longer, not the literal integer value.

The table aliased as LI (last join on the list) seems to be the only table that isn't indexed on its key, and seems to allow the query to perform closer to the 1st query if I remove that table as join and using the integer value instead of the sub query.

SQL Server 11

That is odd. What does `SELECT TOP 1 LatestProcessDateKey FROM DimDate` return? What is the type of `LB.ProcessDate` and is it indexed? What does `EXPLAIN` say? And is this an actively used production database? If so, it could be you were just waiting for a lock. — Schwern, Nov 16 '16 at 23:35
What if you use `Declare @d date = convert(date, 20161116); ... WHERE LB.ProcessDate=@d` — artm, Nov 16 '16 at 23:40
@Schwern That little sub query returns the value I have plugged in as an integer (today as of 11/16/2016). `ProcessDate` is a YYYYMMDD date representation as an `int`. I'll have to grab `EXPLAIN` msgs in the morning. It is technically being used as a production server, just being used at limited capacity. `ProcessDate` is apart of a combo key, so I am assuming it's indexed. — Chad Harrison, Nov 17 '16 at 02:11

score 0 · Answer 1 · answered Nov 17 '16 at 00:04

The real answer to your question lies in the execution plan for the query. You can see the actual plan in SSMS.

Without the plan, the rest is speculation. However, based on my experience, what changes is the way the joins are processed. In my experience, queries slow down considerably when a query switches to nested loop joins. This is at the whim of the optimizer, which -- when there is a constant -- thinks this is the best way to run the query.

I'm not sure why this would be the case. Perhaps an index on FactALSAppSnapshot(ProcessDate, ALSAppID, ALSAppBranchKey) would speed up both versions of the query.

Why would a sub query perform better than a litteral value in a WHERE clause with multiple joins?

1 Answers1