I've got many "actual" and "history companion tables" (so to speak) with structure of last like this:
values| date_deal | type_deal | num (autoinc)
value1| 01.01.2012 | i | 1
value1| 02.01.2012 | u | 2
value2| 02.01.2012 | i | 3
value2| 03.01.2012 | u | 4
value1| 04.01.2012 | d | 5
value2| 05.01.2012 | u | 6
value2| 08.01.2012 | u | 7
If I insert (or update or delete) record in "actual" table, trigger puts affected record into "history table" with date_deal = Geddate(), type_deal = i|u|d (for insert, update and delete triggers respectivly) and num as autoinc unique value
So the question is how to get last record for each distinct value valid on certain date and excluding from final result records which type_deal = 'd' (since that record was deleted from actual table by that time and we don't want to have anything assosiated with it)
The way I do it most of the time:
SELECT *
FROM t_table1 t1
WHERE t1.num = ( SELECT MAX(num)
FROM t_table1 t2
WHERE t2.[values] = t1.[values]
AND t2.[date_deal] < @dt)
AND t1.[type_deal] <> 'D'
But that works very slow sometimes. I'm looking for more efficient alternative. Please, help
So, an update. Thanks for replies, friends.
I've made some testing on both actual and testing servers. In order to put these different approaches into same league I've decided that we should take all fields from source table.
Testing server has bellow 200K records and I also had a luxury of using DBCC FreeProcCache and DBCC DropCleanbuffers directives. Actual working server has over 2.3M records and also no option for droping buffs or cache since.. well.. it is in use by real users. So it was droped only once and i've got results right after that.
Here is actual queries and time it took on both servers:
Original:
DECLARE @dt datetime = CONVERT(datetime, '01.08.2013', 104)
SELECT *
FROM [CLIENTS_HISTORY].[dbo].[Clients_all_h] c
WHERE c.num = ( SELECT MAX(num)
FROM [CLIENTS_HISTORY].[dbo].[Clients_all_h] c2
WHERE c2.[AccountSys] = c.[AccountSys]
AND date_deal <= @dt)
AND c.type_deal <> 'D'
61sec @ 2'316'890rec on real one, 4sec @ 191'533 on test
Rahul's:
SELECT *
FROM [CLIENTS_HISTORY].[dbo].[Clients_all_h] c
GROUP BY [all_fields]
HAVING c.num = ( SELECT MAX(num)
FROM [CLIENTS_HISTORY].[dbo].[Clients_all_h] c2
WHERE c2.[AccountSys] = c.[AccountSys]
AND date_deal <= @dt)
AND c.type_deal <> 'D'
62sec @ 2'316'890rec on real one, 4sec @ 191'533 on test Almost equal
George's (with some major changes):
SELECT * FROM
(
SELECT *,
ROW_NUMBER() OVER(PARTITION BY accountsys ORDER BY num desc) AS aa
FROM [CLIENTS_HISTORY].[dbo].[Clients_all_h] c
WHERE c.date_deal < @dt) as a
WHERE aa=1
AND type_deal <> 'D'
76sec @ 2'316'890rec on real one, 5sec @ 191'533 on test
So far original and Rahul's are fastest and George's is not so fast.