33

I want to fill the calendar object which requires start and end date information. I have one column which contains a sequence of dates. Some of the dates are consecutive (have one day difference) and some are not.

InfoDate  

2013-12-04  consecutive date [StartDate]
2013-12-05  consecutive date
2013-12-06  consecutive date [EndDate]

2013-12-09                   [startDate]
2013-12-10                   [EndDate]

2014-01-01                   [startDate]
2014-01-02 
2014-01-03                   [EndDate]

2014-01-06                   [startDate]
2014-01-07                   [EndDate]

2014-01-29                   [startDate]
2014-01-30 
2014-01-31                   [EndDate]

2014-02-03                   [startDate]
2014-02-04                   [EndDate]

I want to pick each consecutive dates range’s start and end date (the first one and the last one in the block).

StartDate     EndDate

2013-12-04    2013-12-06
2013-12-09    2013-12-10
2014-01-01    2014-01-03
2014-01-06    2014-01-07
2014-01-29    2014-01-31
2014-02-03    2014-02-04

I want to solve the problem using SQL only.

Just a learner
  • 26,690
  • 50
  • 155
  • 234
Shamim
  • 461
  • 1
  • 7
  • 14
  • 1
    What do the empty lines in the second listing mean? Do you really need to solve this in SQL? This seems pretty hard to express in SQL (at least in the standard one), the obvious algorithm is pretty much sequetial and could be easily written in a procedural language. If SQL is really needed, I would use a stored procedure. – Palec Dec 05 '13 at 14:55

9 Answers9

65

No joins or recursive CTEs needed. The standard gaps-and-island solution is to group by (value minus row_number), since that is invariant within a consecutive sequence. The start and end dates are just the MIN() and MAX() of the group.

WITH t AS (
  SELECT InfoDate d,ROW_NUMBER() OVER(ORDER BY InfoDate) i
  FROM @d
  GROUP BY InfoDate
)
SELECT MIN(d),MAX(d)
FROM t
GROUP BY DATEDIFF(day,i,d)
TommCatt
  • 5,498
  • 1
  • 13
  • 20
Anon
  • 10,660
  • 1
  • 29
  • 31
  • 1
    I think the group by should be changed to: GROUP BY DATEADD(day,-i,d) – BennyBechDk Apr 09 '15 at 08:34
  • @BennyBechDk `GROUP BY DATEDIFF(day,i,d)` and `GROUP BY DATEADD(day,-i,d)` will generate identical groups. – Anon Apr 09 '15 at 17:03
  • Maybe been downvoted as you say "no need to use CTEs" - and then use a CTE! But you could of course substitute the CTE in for `t` in your final `SELECT` so you are still correct... – El Ronnoco Aug 26 '15 at 13:53
  • Hi TommCatt, sorry id doesn't work with INPUT in form StartDate and EndDate. – Mike S Apr 05 '19 at 18:25
  • Sorry, new to CTE. I got ERROR: column "day" does not exist. ``` WITH t AS ( SELECT date d,ROW_NUMBER() OVER(ORDER BY date) i FROM GROUP BY date ) SELECT MIN(d),MAX(d) FROM t GROUP BY DATEDIFF(DAY,i,d) ``` – sasawatc Jul 30 '19 at 04:07
  • 1
    Great solution. Also, if you change `ROW_NUMBER()` to `DENSE_RANK()`, then this code will work for the case where there are duplicate dates. – andrew Apr 14 '20 at 22:59
6

Here you go..

;WITH CTEDATES
AS
(
    SELECT ROW_NUMBER() OVER (ORDER BY Infodate asc ) AS ROWNUMBER,infodate FROM YourTableName  

),
 CTEDATES1
AS
(
   SELECT ROWNUMBER, infodate, 1 as groupid FROM CTEDATES WHERE ROWNUMBER=1
   UNION ALL
   SELECT a.ROWNUMBER, a.infodate,case datediff(d, b.infodate,a.infodate) when 1 then b.groupid else b.groupid+1 end as gap FROM CTEDATES A INNER JOIN CTEDATES1 B ON A.ROWNUMBER-1 = B.ROWNUMBER
)

select min(mydate) as startdate, max(infodate) as enddate from CTEDATES1 group by groupid
Sateesh Pagolu
  • 9,282
  • 2
  • 30
  • 48
  • 2
    You should be using `OVER (ORDER BY Infodate)` instead of `OVER (ORDER BY (SELECT 1))`. Also, change `min(mydate)` with `min(infodate)`. Other than that, this is a good answer – Lamak Dec 05 '13 at 14:50
1

--MS SQL

with cte as (
select start_date, end_date,
    dateadd(d, -row_number() over (order by start_date), start_date) as GRN
from projects)
select min(start_date), max(end_date) from cte group by grn order by grn;

--Oracle

with cte as(
select start_date, end_date, 
    start_date - row_number() over (order by start_date) as GRN 
    from projects)
select min(start_date), max(end_date) from cte  group by grn order by grn;
Rajesh
  • 562
  • 1
  • 10
  • 26
0

I have inserted these values into a table called #consec and then perforemed the following:

select t1.*
,t2.infodate as binfod
into #temp1
from #consec t1
left join #consec t2 on dateadd(DAY,1,t1.infodate)=t2.infodate

select t1.*
,t2.infodate as binfod
into #temp2
from #consec t1
left join #consec t2 on dateadd(DAY,1,t2.infodate)=t1.infodate
;with cte as(
select infodate,  ROW_NUMBER() over(order by infodate asc) as seq from #temp1
where binfod is null
),
cte2 as(
select infodate, ROW_NUMBER() over(order by infodate asc) as seq from #temp2
where binfod is null
)

select t2.infodate as [start_date]
,t1.infodate as [end_date] from cte t1
left join cte2 t2 on t1.seq=t2.seq 

As long as your date periods are not overlapping, that should do the job for you.

Kiril Rusev
  • 745
  • 3
  • 9
0

Here it is my sample with test data:

--required output
-- 01 - 03
-- 08 - 09
-- 12 - 14

DECLARE @maxRN int;
WITH #tmp AS (
                SELECT CAST('2013-01-01' AS date) DT
    UNION ALL   SELECT CAST('2013-01-02' AS date)
    UNION ALL   SELECT CAST('2013-01-03' AS date)
    UNION ALL   SELECT CAST('2013-01-05' AS date)
    UNION ALL   SELECT CAST('2013-01-08' AS date)
    UNION ALL   SELECT CAST('2013-01-09' AS date)
    UNION ALL   SELECT CAST('2013-01-12' AS date)
    UNION ALL   SELECT CAST('2013-01-13' AS date)
    UNION ALL   SELECT CAST('2013-01-14' AS date)
),
#numbered AS (
    SELECT 0 RN, CAST('1900-01-01' AS date) DT
    UNION ALL
    SELECT ROW_NUMBER() OVER (ORDER BY DT) RN, DT
    FROM #tmp
)

SELECT * INTO #tmpTable FROM #numbered;
SELECT @maxRN = MAX(RN) FROM #tmpTable;

INSERT INTO #tmpTable
SELECT @maxRN + 1, CAST('2100-01-01' AS date);

WITH #paired AS (
    SELECT 
    ROW_NUMBER() OVER(ORDER BY TStart.DT) RN, TStart.DT DTS, TEnd.DT DTE
    FROM #tmpTable TStart
    INNER JOIN #tmpTable TEnd 
    ON TStart.RN = TEnd.RN - 1
    AND DATEDIFF(dd,TStart.DT,TEnd.DT) > 1  
)

SELECT TS.DTE, TE.DTs 
FROM #paired TS
INNER JOIN #paired TE ON TS.RN = TE.RN -1
AND TS.DTE <> TE.DTs -- you could remove this filter if you want to have start and end on the same date

DROP TABLE #tmpTable

Replace #tmp data with your actual table.

LittleSweetSeas
  • 6,786
  • 2
  • 21
  • 26
0

You can do like this and here is the sqlfiddle

select
  min(ndate) as start_date,
  max(ndate) as end_date
from
(select
  ndate,
  dateadd(day, -row_number() over (order by ndate), ndate) as rnk
 from dates
 ) t
 group by
   rnk
zealous
  • 7,336
  • 4
  • 16
  • 36
0

Another simple solution that could work here is -

with tmp as 
(
select
datefield
, dateadd('day',-row_number() over(order by date asc),datefield) as date_group 
from table
)
select
min(datefield) as start_date
, max(datefield) as end_date 
from tmp
group by date_group
Shyam
  • 1
0

with cte as( select start_date, end_date, start_date - row_number() over (order by start_date) as GRN from projects)
select min(start_date), max(end_date) from cte
group by grn order by count(grn),min(start_date)

  • 2
    Thank you for contributing to the Stack Overflow community. This may be a correct answer, but it’d be really useful to provide additional explanation of your code so developers can understand your reasoning. This is especially useful for new developers who aren’t as familiar with the syntax or struggling to understand the concepts. **Would you kindly [edit] your answer to include additional details for the benefit of the community?** – Jeremy Caney Jul 14 '23 at 16:59
-1
SELECT InfoDate ,
    CASE
      WHEN TRUNC(InfoDate - 1) = TRUNC(lag(InfoDate,1,InfoDate) over (order by InfoDate))
      THEN NULL
      ELSE InfoDate
    END STARTDATE,
    CASE
      WHEN TRUNC(InfoDate + 1) = TRUNC(lead(InfoDate,1,InfoDate) over (order by InfoDate))
      THEN NULL
      ELSE InfoDate
    END ENDDATE
  FROM TABLE;
Guillaume S.
  • 1,515
  • 1
  • 8
  • 21