2

frequently I encounter a situation like this, where I need to join a big table to a certain transformation of a table.

I have made an example with a big table and a smaller prices table.

Enter the table CarPrices, which has prices per car brand/model with starting and ending dates. I want to join all sold cars to the sales price in the CarPrices table, on the criterium SaleDate BETWEEN PriceStartingDate and PriceEndingDate, but if there is no price for the period, I want to join to the newest price that is found.

I can accomplish it like this but it is terribly slow:

WITH CarPricesTransformation AS (
    SELECT CarBrand, CarModel, PriceStartingDate,
        CASE WHEN row_number() OVER (PARTITION BY CarBrand, CarModel, 
            ORDER BY PriceStartingDate DESC) = 1
            THEN NULL ELSE PriceEndingDate END PriceEndingDate,
        Price
    FROM CarPrices
)
SELECT SUM(Price)
FROM LargeCarDataBase C
INNER JOIN CarPricesTransformation P
ON C.CarBrand = P.CarBrand
AND C.CarModel = P.CarModel
AND C.SaleDate >= P.PriceStartingDate
AND (C.SaleDate <= P.PriceEndingDate OR P.PriceEndingDate IS NULL)

A reliable way to do it quicker is to forget about making a VIEW and creating a stored procedure instead, where I first prepare the smaller prices table as a temporary table with the correct clustered index, and then make the join to that. This is much faster. But I would like to stick with a view.

Any thoughts...?

Adam Wenger
  • 17,100
  • 6
  • 52
  • 63
littlegreen
  • 7,290
  • 9
  • 45
  • 51
  • 1
    See [Quassnoi's answer here](http://stackoverflow.com/questions/1531835/sql-2005-cte-vs-temp-table-performance-when-used-in-joins-of-other-tables/1531994#1531994) If you want the intermediate result materialized and indexed you are much better off doing this explicitly using `#temp` tables although the link in the answer gives an example of doing this by using a plan guide. – Martin Smith Feb 18 '11 at 17:05
  • I'm confused, what is the name of the view you are using? It seems like you might be using CTE/View interchangeably. This is a habit you should break as they are not the same thing. – Abe Miessler Feb 18 '11 at 17:05
  • What does your execution plan say? This could potentially be solved by adding/modifying the indexes you have on your tables... – Abe Miessler Feb 18 '11 at 17:06
  • @Abe. I take it that the code in the question is the View definition. `CREATE VIEW foo AS ` – Martin Smith Feb 18 '11 at 17:17
  • Have you got some metrics to work with? # of Records in LargeCarDataBase? # of records in CarPricesTransformation? AVG prices per car etc? – RichardTheKiwi Feb 21 '11 at 03:04
  • Some table structures would help too. It looks like you need an index on [CarBrand, CarModel, PriceStartingDate desc]. – RichardTheKiwi Feb 21 '11 at 03:08
  • @Martin - Good find with the previous question - I added an answer there. Although (1) there could be other ways to tackle CTEs, and (2) I think the crux of this question is the the last 30 characters of the query, but awaiting more info. – RichardTheKiwi Feb 21 '11 at 03:13
  • @Abe. Yes, the view definition is the whole code I posted, like Martin already guessed. If I switch to using #temp tables, I need to switch to stored procedures but lose the flexibility of a view. – littlegreen Feb 21 '11 at 09:32
  • @Richard: I am just talking about the general case. I cannot post the specific code. You can assume that the tables are really large, the transformation is much more complex, every standard index is there but the query is still running slow. Eg. a worst-case scenario. – littlegreen Feb 21 '11 at 09:35
  • @All: sorry for late response. I didn't look at my PC over the weekend. – littlegreen Feb 21 '11 at 09:35
  • @littlegreen Open ending a question is the least likely way to get a useful reply. The part you left out could very well be the most crucial piece of the puzzle, without which all answers will be futile. – RichardTheKiwi Feb 21 '11 at 10:45
  • @Richard: No further comment. – littlegreen Feb 22 '11 at 10:44

2 Answers2

0

You can't make a "smaller prices table" since the price depends on the sale date. Also, why the CTE in the first place?

Select
  Sum(Coalesce(ActivePrice.Price, LatestPrice.Price))
From
  LargeCarDataBase As Sales
  Left Outer Join CarPrices As ActivePrice
    On Sales.CarBrand = ActivePrice.CarBrand
    And Sales.CarModel = ActivePrice.CarModel
    And (((Sales.SaleDate >= ActivePrice.PriceStartingDate)
          And ((Sales.SaleDate <= ActivePrice.PriceEndingDate)
               Or (ActivePrice.PriceEndingDate Is Null)))
  Left Outer Join CarPrices As LatestPrice
    On Sales.CarBrand = LatestPrice.CarBrand
    And Sales.CarModel = LatestPrice.CarModel
    And LatestPrice.PriceEndingDate Is Null
Stu
  • 15,675
  • 4
  • 43
  • 74
0

Have you tried Indexed Views?

The results from Indexed Views are automatically commited to disk so you can retrieve them super-fast.

    CREATE VIEW [dbo].[SuperFastCarPrices] WITH SCHEMABINDING AS
    SELECT  C.CarBrand,
            C.CarModel,
            C.SaleDate,
            SUM(P.Price) AS Price
    FROM CarPrices P
    INNER JOIN LargeCarDataBase C
        ON C.CarBrand = P.CarBrand
        AND C.CarModel = P.CarModel
        AND C.SaleDate >= P.PriceStartingDate
        AND (P.PriceEndingDate IS NULL OR C.SaleDate <= P.PriceEndingDate)
    GROUP BY C.CarBrand, C.CarModel, C.SaleDate

    CREATE UNIQUE CLUSTERED INDEX [IDX_SuperFastCarPrices] 
    ON [dbo].[SuperFastCarPrices](CarBrand, CarModel, SaleDate)

You can then select directly from this view, which will return records at the same speed as selecting from a table.

There is the downside that indexed views slow down changes to the underlying tables. If you are worried about the cost of inserting records into the table LargeCarDataBase after this view has been created, you can create an index on columns CarBrand, CarModel and SaleDate which should speed up insertion and update on this table.

For more on Indexed Views see Microsoft's Article.

Steven de Salas
  • 20,944
  • 9
  • 74
  • 82