1

I have a table that contains parts of projects with their start and end dates. This is in SQL Server 2017. The task is to list start, end dates and duration by project. (A project consists of all lines where start_date of next row is the same as end_date of previous. No lines overlap)

With this, you can reproduce the same sample data I'm working with.

CREATE TABLE PROJECTS (
Task_id int PRIMARY KEY ,
Start_Date date,
End_Date date )

INSERT INTO PROJECTS
VALUES 
(1, '01/01/2020','01/02/2020'),
(2, '01/02/2020','01/03/2020'),
(3, '01/03/2020','01/04/2020'),

(4, '01/06/2020','01/07/2020'),
(5, '01/07/2020','01/08/2020'),

(6, '01/10/2020','01/11/2020'),

(7, '01/12/2020','01/13/2020'),

(8, '01/21/2020','01/22/2020'),
(9, '01/22/2020','01/23/2020'),
(10,'01/23/2020','01/24/2020')

based on the above data, my output would be as follows:

Project_id  Start_Date  End_Date    Duration_days
1           01/01/2020  04/01/2020  3
2           06/01/2020  08/01/2020  2
3           10/01/2020  11/01/2020  1
4           12/01/2020  13/01/2020  1
5           21/01/2020  24/01/2020  3

The code below represents the first part of the execution, calculation of difference between end dates.

select projects.*,
   datediff(day, end_date,
            lead(end_date) over (partition by task_id order by end_date)
           ) as diff
from projects;

This is what I expected:

Task_id Start_Date  End_Date    diff
1       2020-01-01  2020-01-02  1
2       2020-01-02  2020-01-03  1
3       2020-01-03  2020-01-04  1
4       2020-01-06  2020-01-07  3
5       2020-01-07  2020-01-08  1
6       2020-01-10  2020-01-11  3
7       2020-01-12  2020-01-13  2
8       2020-01-21  2020-01-22  9
9       2020-01-22  2020-01-23  1
10      2020-01-23  2020-01-24  1

what I get is NULL in every row of Diff column.

halfer
  • 19,824
  • 17
  • 99
  • 186
Lana B
  • 496
  • 6
  • 17
  • 1
    is row 8 incorrect? shouldnt it be 9? – Doug Coats Feb 07 '20 at 19:42
  • 2
    https://stackoverflow.com/questions/20402089/detect-consecutive-dates-ranges-using-sql. This appears to be a standard gaps-and-islands problem. This link may help. – JJ32 Feb 07 '20 at 19:43
  • 1
    So, a couple of things. You have 10 distinct `Task_id` values, so when you partition over them, the `LEAD` values will always be `NULL`, which you're seeing. Is that what you intended? Do you really mean to be taking the `DATEDIFF` from `end_date` to `end_date`? – Eric Brandt Feb 07 '20 at 19:43
  • @DougCoats yes, i was calculating it in my head - 9 is correct. Thank you – Lana B Feb 07 '20 at 20:30
  • @JJ32 in that code, @d is not recognised, no var is declared, cannot try it. – Lana B Feb 07 '20 at 20:31
  • @EricBrandt, end date vs end date of previous row. could be end_date vs start date of next row, as long as it's not the same - it would work. – Lana B Feb 07 '20 at 20:32
  • @JJ32 in that link, the second answer returns me a final output i'm trying to get to, except the start dates (since they are based on end-date) are 1 day less than they should. But it's a good start, I can make it as a CTE and then adjust it in the outer select statement plus add the duration column. Thanks! – Lana B Feb 07 '20 at 20:42

1 Answers1

0

select a.*, case when a.Task_id = 1 then 1 else datediff(day,b.End_Date, a.end_date ) end as diff from #projects a left join #projects b on b.Task_id = a.Task_id-1

aschul
  • 1
  • 1