More efficient alternative to subquery

Question

I've got many "actual" and "history companion tables" (so to speak) with structure of last like this:

values|  date_deal | type_deal | num (autoinc)
value1| 01.01.2012 |         i |   1
value1| 02.01.2012 |         u |   2
value2| 02.01.2012 |         i |   3
value2| 03.01.2012 |         u |   4
value1| 04.01.2012 |         d |   5
value2| 05.01.2012 |         u |   6
value2| 08.01.2012 |         u |   7

If I insert (or update or delete) record in "actual" table, trigger puts affected record into "history table" with date_deal = Geddate(), type_deal = i|u|d (for insert, update and delete triggers respectivly) and num as autoinc unique value

So the question is how to get last record for each distinct value valid on certain date and excluding from final result records which type_deal = 'd' (since that record was deleted from actual table by that time and we don't want to have anything assosiated with it)

The way I do it most of the time:

SELECT  *
FROM    t_table1 t1
WHERE   t1.num = (  SELECT  MAX(num)
                    FROM    t_table1 t2
                    WHERE   t2.[values] = t1.[values]
                        AND t2.[date_deal] < @dt)
    AND t1.[type_deal] <> 'D'

But that works very slow sometimes. I'm looking for more efficient alternative. Please, help

So, an update. Thanks for replies, friends.

I've made some testing on both actual and testing servers. In order to put these different approaches into same league I've decided that we should take all fields from source table.

Testing server has bellow 200K records and I also had a luxury of using DBCC FreeProcCache and DBCC DropCleanbuffers directives. Actual working server has over 2.3M records and also no option for droping buffs or cache since.. well.. it is in use by real users. So it was droped only once and i've got results right after that.

Here is actual queries and time it took on both servers:

Original:

DECLARE @dt datetime = CONVERT(datetime, '01.08.2013', 104)

SELECT  *
FROM    [CLIENTS_HISTORY].[dbo].[Clients_all_h] c
WHERE   c.num = (   SELECT  MAX(num)
                    FROM    [CLIENTS_HISTORY].[dbo].[Clients_all_h] c2
                    WHERE   c2.[AccountSys] = c.[AccountSys]
                        AND date_deal <= @dt)
    AND c.type_deal <> 'D'

61sec @ 2'316'890rec on real one, 4sec @ 191'533 on test

Rahul's:

SELECT  *
FROM    [CLIENTS_HISTORY].[dbo].[Clients_all_h] c
GROUP BY [all_fields]
HAVING  c.num = (   SELECT  MAX(num)
                    FROM    [CLIENTS_HISTORY].[dbo].[Clients_all_h] c2
                    WHERE   c2.[AccountSys] = c.[AccountSys]
                        AND date_deal <= @dt)
    AND c.type_deal <> 'D'

62sec @ 2'316'890rec on real one, 4sec @ 191'533 on test Almost equal

George's (with some major changes):

SELECT * FROM 
(
SELECT  *,
        ROW_NUMBER() OVER(PARTITION BY accountsys ORDER BY num desc) AS aa
FROM    [CLIENTS_HISTORY].[dbo].[Clients_all_h] c
WHERE   c.date_deal < @dt) as a
WHERE   aa=1
    AND type_deal <> 'D'

76sec @ 2'316'890rec on real one, 5sec @ 191'533 on test

So far original and Rahul's are fastest and George's is not so fast.

With any performance problem the first thing to do is look at the [execution plan](http://stackoverflow.com/questions/7359702/how-do-i-obtain-a-query-execution-plan). It might not be the subquery which is causing the performane issue (often SQL Server optimizes the subquery away into a join or equivalent) — Justin, Oct 09 '13 at 12:23
Thanks for idea, Justin. It seems like SQL Server was actualy optimizing query into INNER JOIN — user2838376, Oct 10 '13 at 05:27

score 0 · Answer 1 · answered Oct 09 '13 at 12:46

Try using GROUP BY..HAVING CLAUSE

SELECT  *
    FROM    t_table1 t1
    GROUP BY [column_names]
    HAVING   t1.num = (  SELECT  MAX(num)
                        FROM    t_table1 t2
                        WHERE   t2.[values] = t1.[values]
                            AND t2.[date_deal] < @dt)
        AND t1.[type_deal] <> 'D'

score 0 · Answer 2 · answered Oct 09 '13 at 14:34

0

I think row_num() could be usefull for you as follows:

select
   *
from
(
   select
     *,
   row_number() over( partition by date_deal order by num) as aa
      from
   t_table1 t1
   where
     t1.[type_deal] <> 'D'
) as a
where
   aa=1

answered Oct 09 '13 at 14:34

George Mavritsakis

6,829
2
35
42

More efficient alternative to subquery

2 Answers2