5

I have a SQL Server table with a date (session_date) and a value (price).

session_date price
2022-12-31 10
2022-12-30 9
2022-12-29 5
2022-12-28 7
2022-12-27 2
2022-12-26 9
2022-12-25 3
2022-12-24 1

I want to compute the rolling maximum over the last 3 rows (dates) and retrieve the session_date associated with each maximum:

session_date price rolling_max rolling_last_date
2022-12-31 4 9 2022-12-29
2022-12-30 4 9 2022-12-29
2022-12-29 9 9 2022-12-29
2022-12-28 7 7 2022-12-28
2022-12-27 2 4 2022-12-26
2022-12-26 4 4 2022-12-26
2022-12-25 3 3 2022-12-25
2022-12-24 1 1 2022-12-24

I'm using this query to compute the rolling maximum, but how can I get the associated session_date ?

select session_date
, price
, max(price) over (order by session_date rows between 3 preceding and current row) as rolling_max
from my_table
Charlieface
  • 52,284
  • 6
  • 19
  • 43
NassimH
  • 462
  • 3
  • 13

4 Answers4

3

Here is one method that avoids self-joins and complex LEAD/LAG expressions. It would work with simple data types like int, date, decimal. With float you need to be careful to not lose precision when converting to/from the text. I would argue that price should never be stored as float, I prefer money type to store this kind of data.

Idea: Combine values of two columns into a single column that can be used in a rolling MAX function and then split them back.

Here I combine them into an ordinary text varchar. If you use int and date you can combine into varbinary, which may be a bit more efficient.

REPLACE(STR(price, 20, 0), ' ', '0') + CONVERT(varchar(10), session_date, 126) AS combined

Convert price into a string, make sure you make it left-padded with zeros, so that it is sorted properly. Convert session_date into a string and attach it to the price. Later split this string back into corresponding values.

Sample data

DECLARE @my_table TABLE (
    session_date date,
    price decimal(15,2)
);

insert into @my_table values
('2022-12-31',  4),
('2022-12-30',  4),
('2022-12-29',  9),
('2022-12-28',  7),
('2022-12-27',  2),
('2022-12-26',  4),
('2022-12-25',  3),
('2022-12-24',  1);

Query

WITH
CTE
AS
(
    SELECT
        *
        ,max(combined) over (order by session_date DESC ROWS BETWEEN CURRENT ROW AND 3 FOLLOWING) as rolling_combined
    FROM
        @my_table AS T
        CROSS APPLY
        (
            SELECT
                REPLACE(STR(price, 20, 0), ' ', '0') + CONVERT(varchar(10), session_date, 126) AS combined
        ) AS A
)
SELECT
    session_date
    ,price
    ,CAST(LEFT(rolling_combined, 20) AS decimal(15,2)) AS rolling_price
    ,CAST(RIGHT(rolling_combined, 10) as date) AS rolling_date
    ,rolling_combined
FROM CTE
ORDER BY session_date DESC;

It is important to have the same ORDER BY in the main query and in the MAX to avoid second sort.

Result

I included the combined column in the output to show how it looks like.

+--------------+-------+---------------+--------------+--------------------------------+
| session_date | price | rolling_price | rolling_date |        rolling_combined        |
+--------------+-------+---------------+--------------+--------------------------------+
| 2022-12-31   | 4.00  | 9.00          | 2022-12-29   | 000000000000000000092022-12-29 |
| 2022-12-30   | 4.00  | 9.00          | 2022-12-29   | 000000000000000000092022-12-29 |
| 2022-12-29   | 9.00  | 9.00          | 2022-12-29   | 000000000000000000092022-12-29 |
| 2022-12-28   | 7.00  | 7.00          | 2022-12-28   | 000000000000000000072022-12-28 |
| 2022-12-27   | 2.00  | 4.00          | 2022-12-26   | 000000000000000000042022-12-26 |
| 2022-12-26   | 4.00  | 4.00          | 2022-12-26   | 000000000000000000042022-12-26 |
| 2022-12-25   | 3.00  | 3.00          | 2022-12-25   | 000000000000000000032022-12-25 |
| 2022-12-24   | 1.00  | 1.00          | 2022-12-24   | 000000000000000000012022-12-24 |
+--------------+-------+---------------+--------------+--------------------------------+
Vladimir Baranov
  • 31,799
  • 5
  • 53
  • 90
  • 2
    Great answer. "It is important to have the same ORDER BY in the main query and in the MAX to avoid second sort." in other words: if you want to sort by `session_date ASC` then you need to change to `max(combined) over (order by session_date ASC ROWS 3 PRECEDING) as rolling_combined` – Charlieface Mar 30 '23 at 13:03
1

Try this :

with cte as (
  select session_date
  , price
  , max(price) over (order by session_date rows between 3 preceding and current row) as rolling_max  from my_table
)
select c.*, t.session_date as rolling_last_date
from cte c
inner join (
  select price, min(session_date) as session_date
  from my_table t
  group by price
) t on t.price = c.rolling_max 
order by session_date desc

To get rolling_last_date you will have to join your result with the list the smallest session_date by price using GROUP BY and MIN()

Result :

session_date    price   rolling_max rolling_last_date
2022-12-31      4       9           2022-12-29
2022-12-30      4       9           2022-12-29
2022-12-29      9       9           2022-12-29
2022-12-28      7       7           2022-12-28
2022-12-27      2       4           2022-12-26
2022-12-26      4       4           2022-12-26
2022-12-25      3       3           2022-12-25
2022-12-24      1       1           2022-12-24

Demo here

SelVazi
  • 10,028
  • 2
  • 13
  • 29
1

You can do this without any joins, purely by using window functions. You need to use a pile of LAG or LEAD functions to pick the date matching the highest price of the window.

You must do this with a derived table, as you first need to calculate the max price and the individual LAG or LEAD values.

SELECT
  session_date,
  price,
  maxPrice,
  dateOfMaxPrice =
    CASE maxPrice
    WHEN lagP1 THEN lagS1
    WHEN lagP2 THEN lagS2
    WHEN lagP3 THEN lagS3
    ELSE session_date
    END
FROM (
    SELECT *,
      maxPrice = MAX(price) OVER (ORDER BY session_date DESC ROWS BETWEEN CURRENT ROW AND 3 FOLLOWING),
      lagP1 = LEAD(price, 1) OVER (ORDER BY session_date DESC),
      lagP2 = LEAD(price, 2) OVER (ORDER BY session_date DESC),
      lagP3 = LEAD(price, 3) OVER (ORDER BY session_date DESC),
      lagS1 = LEAD(session_date, 1) OVER (ORDER BY session_date DESC),
      lagS2 = LEAD(session_date, 2) OVER (ORDER BY session_date DESC),
      lagS3 = LEAD(session_date, 3) OVER (ORDER BY session_date DESC)
    FROM my_table t
) t
ORDER BY
  session_date DESC;

db<>fiddle

Note that I have used LEAD with a descending sort, rather than LAG ascending, to match your outer ORDER BY, this means the server does not have to resort the rows after doing the calculations. The result is the same.

Charlieface
  • 52,284
  • 6
  • 19
  • 43
  • Seems promising, but in this situation it returns some NULL : [db<>fiddle](https://dbfiddle.uk/jqZ0arHF), do you have an idea ? – NassimH Mar 26 '23 at 21:52
  • See new edits.. – Charlieface Mar 27 '23 at 09:00
  • I struggle to understand why your first answer is not working with float / decimal and returns null . If I want to increase the windows size, for example to 100, it will not be convinent. – NassimH Mar 27 '23 at 15:42
  • 1
    Because `maxPrice` is being calculated *per row*, so when you do another `MAX(CASE` it only calculates it for that row. Do a `SELECT *` and you will se what is going on. If the window is large it's probably better to just use a self-join like the other answer. – Charlieface Mar 27 '23 at 16:47
  • @Charlieface, I think it is possible to avoid both self-joins and complex `LEAD/LAG` expressions. I added an answer – Vladimir Baranov Mar 30 '23 at 12:58
1

You can use your query within a CTE then use subquery to to add max session_date when price is the maxRollingPrice and session_date is within the same range.

create table my_table (
  session_date  date,
  price int
);

insert into my_table values
('2022-12-31',  4),
('2022-12-30',  4),
('2022-12-29',  9),
('2022-12-28',  7),
('2022-12-27',  2),
('2022-12-26',  4),
('2022-12-25',  3),
('2022-12-24',  1);

Query:

with maxRollingPrice as(
select session_date
, price
, max(price) over (order by session_date rows between 3 preceding and current row) as rolling_max 
  
from my_table)
select *,
  (select  max(session_date) from my_table  mt where datediff(day,mt.session_date,rp.session_date) between 0 and 3 and mt.price=rolling_max) rolling_last_date
from maxRollingPrice rp  
order by rp.session_date desc

Output:

session_date price rolling_max rolling_last_date
2022-12-31 4 9 2022-12-29
2022-12-30 4 9 2022-12-29
2022-12-29 9 9 2022-12-29
2022-12-28 7 7 2022-12-28
2022-12-27 2 4 2022-12-26
2022-12-26 4 4 2022-12-26
2022-12-25 3 3 2022-12-25
2022-12-24 1 1 2022-12-24

fiddle