0

I have a simple query and it works fast (<1sec):

;WITH JointIncomingData AS
(
    SELECT A, B, C, D FROM dbo.table1
    UNION ALL
    SELECT A, B, C, D FROM dbo.table2
)
SELECT * 
FROM JointIncomingData D 
WHERE a = '1/1/2020'

However, if I join with another small table in the final SELECT statement it is extremely slow (> 30 sec)

DECLARE @anotherTable TABLE (A DATE, B INT) 

INSERT INTO @anotherTable (AsOfDate, FundId) 
VALUES ('1/1/2020', 1)

;WITH JointIncomingData AS
(
    SELECT A, B, C, D FROM dbo.table1
    UNION ALL
    SELECT A, B, C, D FROM dbo.table2
)
SELECT * 
FROM JointIncomingData D 
JOIN @anotherTable T ON T.A = D.A AND T.B = D.B

In the real application, I have a complex UPDATE as the final statement, so I try to avoid copy-paste and introduces UNION to consolidate code.

But now experience an unexpected issue with slowness.

I tried using UNION ALL instead of UNION - with the same result.

Looks like SQL Server pushed simple conditions to each of UNION statements, but when I join it with another table, it doesn't happen and a table scan occurs.

Any advice?

UPDATE: Here is estimated plans for the first simple condition query: https://www.brentozar.com/pastetheplan/?id=SJ5fynTgP

for the query with a join table: https://www.brentozar.com/pastetheplan/?id=H1eny3pxP

Please keep in mind that estimated plans are not exactly for the above query, but more real one, having exactly the same problem.

tgralex
  • 794
  • 4
  • 14
  • What does the query plan look like? What part of the query plan takes up the most resources / time? When comparing the query plan of the "fast" query to the "slow" query what are the big differences? – Igor Jul 28 '20 at 13:47
  • 2
    Please [share your execution plan](https://www.brentozar.com/pastetheplan/) Without the plan, we cannot help you. One issue might be the CTE JointIncomingData. Have you tried converting the CTE to a temp table? – Preben Huybrechts Jul 28 '20 at 13:49
  • 1
    Like @Igor mention, can you share your execution plan? https://www.brentozar.com/pastetheplan/ – Pete -S- Jul 28 '20 at 13:49
  • If the update is slow, it might be a locking issue. You could investigate with [sp_whoisactive](https://github.com/amachanic/sp_whoisactive) – Preben Huybrechts Jul 28 '20 at 13:56
  • Guys, I replaced UPDATE with SELECT to find out the problem. I gave a sample query, its not actual and I don't know how can I share my execution plan from SSMS – tgralex Jul 28 '20 at 13:57
  • Have you pushed down your filters into the individual parts of the union? – shawnt00 Jul 28 '20 at 13:58
  • 2
    Fortunately the page linked to (twice now) has a handy-dandy [tab with instructions](https://www.brentozar.com/pastetheplan/instructions/). Proper performance analysis with what amounts to example queries is fraught with peril, since there can be a multitude of things influencing the plan other than the simple text can reveal. – Jeroen Mostert Jul 28 '20 at 13:59
  • Updated post with estimated plans. Just keep in mind its not an actual query, but alike one – tgralex Jul 28 '20 at 14:02
  • 1
    This sounds like parameter sniffing. When you hard code the date in the 1st query, the plan created is the optimal plan for this date, and this date alone. When you join to a table variable, then the plan created is a plan that will work moderately well for any date, but some dates, depending on cardinality will be much worse. You should be able to use `OPTION(RECOMPILE)` on your second query, which will force the optimiser to re-evaluate the query plan with each execution based on the contents of `@anotherTable` – GarethD Jul 28 '20 at 14:28
  • Could be, I always need to keep it in mind... – tgralex Jul 28 '20 at 14:53

2 Answers2

1

Based on the "similar" execution plan you shared. It would also be better to have the actual plan, to know if your estimates and memory grants are ok.

Key lookup

The index IX_dperf_date_fund should be extended to INCLUDE the following columns nav, equity

Why? Every row the index returns, create a lookup in the clusterd index to retrieve the column values of nav, equity.

Only if this is reasonable for the application, if other queries may benefit as well

CTE

Change your CTE to a temp table.

Example:

SELECT *
INTO #JointIncomingData 
FROM (
    SELECT AsOfDate, FundId, DataSourceId, ShareClass, NetAssetsBase, SharesOutstanding 
    FROM 
    ETL.tblIncomingData
    UNION ALL
    SELECT AsOfDate, FundId, DataSourceId, ShareClass, NetAssetsBase, SharesOutstanding 
    FROM ETL.vIncomingDataDPerf
) x

Why? CTE's are not materialized. and this answer

Bonus: parameter sniffing

If you pass in parameters you might be suffering from parameters sniffing.

Preben Huybrechts
  • 5,853
  • 2
  • 27
  • 63
1

When I'm doing complex updates I normally declare a temp table and insert the rows into it that I intend to update. There's two benefits to this approach, one being that by explicitly collecting the rows to be updated you simplify the logic and make the update itself really simple (just update the rows whose primary key is in your temp table). The other big benefit of it is you can do some sanity checking before actually running your update, and "throw an error" by returning a different value. I think it's generally a good practice to break down queries into simple steps like this, because it makes them much easier to troubleshoot in the future.

Chris
  • 53
  • 7