1

I need to determine start and end dates from the following data set:

item        date              cost         
12345    01/01/15             2.00
12345    01/02/15             2.00
12345    01/03/15             2.00
12345    01/04/15             2.00
12345    01/05/15             2.00
12345    01/06/15             2.00
12345    01/07/15             1.50
12345    01/08/15             1.50
12345    01/09/15             1.50
12345    01/10/15             1.50
12345    01/11/15             1.50
12345    01/12/15             1.50
12345    01/13/15             1.50
12345    01/14/15             2.00
12345    01/15/15             2.00
12345    01/16/15             2.00
12345    01/17/15             2.00
12345    01/18/15             2.00
12345    01/19/15             2.00
12345    01/20/15             2.00
12345    01/26/15             2.00
12345    01/27/15             2.00
12345    01/28/15             2.00
12345    01/29/15             2.00

If possible, what I want is the following output:

item        start              end                  cost
12345    01/01/15             01/06/15             2.00
12345    01/07/15             01/13/15             1.50
12345    01/14/15             01/20/15             2.00
12345    01/26/15             01/29/15             2.00

Basically, any time the changes, or there is more than a 4 day gap between the dates at any given cost.

Thanks.

JNevill
  • 46,980
  • 4
  • 38
  • 63
Joel C
  • 11
  • 1

2 Answers2

0

Here's one way you could do it:

DECLARE @myTable TABLE (item INT, date DATE, cost DECIMAL(5, 2));
INSERT @myTable VALUES (12345, '01/01/15', 2.00), (12345, '01/02/15', 2.00)
, (12345, '01/03/15', 2.00), (12345, '01/04/15', 2.00), (12345, '01/05/15', 2.00)
, (12345, '01/06/15', 2.00), (12345, '01/07/15', 1.50), (12345, '01/08/15', 1.50)
, (12345, '01/09/15', 1.50), (12345, '01/10/15', 1.50), (12345, '01/11/15', 1.50)
, (12345, '01/12/15', 1.50), (12345, '01/13/15', 1.50), (12345, '01/14/15', 2.00)
, (12345, '01/15/15', 2.00), (12345, '01/16/15', 2.00), (12345, '01/17/15', 2.00)
, (12345, '01/18/15', 2.00), (12345, '01/19/15', 2.00), (12345, '01/20/15', 2.00)
, (12345, '01/26/15', 2.00), (12345, '01/27/15', 2.00), (12345, '01/28/15', 2.00)
, (12345, '01/29/15', 2.00);

WITH CTE1 AS (
    SELECT item, date, cost, ROW_NUMBER() OVER (PARTITION BY item ORDER BY date) RN
    FROM @myTable)
, CTE2 AS (
    SELECT item, T.date TDate, T.cost Tcost, prevRow.date SDate, MD.maxDate, ROW_NUMBER() OVER (PARTITION BY item ORDER BY T.date) RN2
    FROM CTE1 T
    OUTER APPLY (
        SELECT date, cost
        FROM CTE1
        WHERE RN = T.RN - 1
        AND item = T.item
        ) prevRow
    OUTER APPLY (
        SELECT MAX(date) FROM CTE1
        ) MD(maxDate)
    WHERE CASE WHEN DATEDIFF(DAY, prevRow.date, T.date) <= 4 THEN T.date END IS NULL
    OR CASE WHEN prevRow.cost = T.cost THEN T.cost END IS NULL)
SELECT item, TDate startDate, ISNULL(nextRow.SDate, C.maxDate) endDate, TCost cost
FROM CTE2 C
OUTER APPLY (
    SELECT SDate
    FROM CTE2
    WHERE RN2 = C.RN2 + 1
    AND item = C.item) nextRow;

Essentially, you want to compare each row with the next to check whether there's either a difference in cost or a date difference of more than 4 days. In newer versions of SQL Server (2012 onwards) you can use LEAD/LAG window functions for this, but I think in SQL Server 2008 probably the easiest way is to assign each row a row number and use applies.

CTE1 assigns the row numbers. CTE2 uses applies to compare with the next row. The select statement at the end pulls out the start and end date of each period where either cost changes or there's a date skip greater than 4 days.

ZLK
  • 2,864
  • 1
  • 10
  • 7
0

Here is a variant on what @ZLK has provided but using joins only without apply. Again we apply use row numbers. 2nd CTE gets first and last rows and each row where there is a change. The final query takes care of selecting the correct final date.

--set up data
declare @table table (item int, tdate date, cost decimal(5,2));
insert @table values
(12345,'01/01/15',2.00),
(12345,'01/02/15',2.00),
(12345,'01/03/15',2.00),
(12345,'01/04/15',2.00),
(12345,'01/05/15',2.00),
(12345,'01/06/15',2.00),
(12345,'01/07/15',1.50),
(12345,'01/08/15',1.50),
(12345,'01/09/15',1.50),
(12345,'01/10/15',1.50),
(12345,'01/11/15',1.50),
(12345,'01/12/15',1.50),
(12345,'01/13/15',1.50),
(12345,'01/14/15',2.00),
(12345,'01/15/15',2.00),
(12345,'01/16/15',2.00),
(12345,'01/17/15',2.00),
(12345,'01/18/15',2.00),
(12345,'01/19/15',2.00),
(12345,'01/20/15',2.00),
(12345,'01/26/15',2.00),
(12345,'01/27/15',2.00),
(12345,'01/28/15',2.00),
(12345,'01/29/15',2.00);
--select * from @table order by tdate;

--query
with cte as (
        select  *, row_number() over (partition by item order by tdate) row_num
        from    @table
),
cte2 as (
        select  c.item, c.tdate, c_pre.tdate pre_date, c_post.tdate post_date, c.cost, row_number() over (partition by c.item order by c.tdate) row_num
        from    cte c
        left    join cte c_pre
                on  c_pre.row_num = c.row_num - 1
        left    join cte c_post
                on  c_post.row_num = c.row_num + 1
        where   c_pre.row_num is null   --first row
        or      c_post.row_num is null  --last row
        or      c_pre.cost != c.cost    --cost difference
        or      datediff(d, c_pre.tdate, c.tdate) > 4   --more than 4 days
)
select  c.item, c.tdate start, 
        case when c_post.post_date is null then c_post.tdate else c_post.pre_date end [end],    --adjustment for last row
        c.cost
from    cte2 c
left    join cte2 c_post
        on  c_post.row_num = c.row_num + 1
where   c.post_date is not null
order   by c.tdate;
SMM
  • 2,225
  • 1
  • 19
  • 30
  • Thank you! I'll try that solution as well. ZLK's worked for my needs with just a bit of tweaking so thanks to him/her as well. – Joel C Sep 08 '16 at 21:25
  • If either answer is helpful make sure to upvote them. If one is the one you use you can mark it as accepted for future users. – SMM Sep 08 '16 at 21:32
  • both actually worked for me with negligible performance difference. thanks to both :) – Joel C Sep 09 '16 at 13:20