10

I have a set of date ranges consisting of both partially and fully overlapping dates, like this:

UserID  StartDate   EndDate 
======  ==========  ==========
1       2011-01-01  2011-01-02  <- A
1       2011-01-01  2011-01-10  <- A
1       2011-01-08  2011-02-15  <- A
1       2011-02-20  2011-03-10  <- B
2       2011-01-01  2011-01-20  <- C
2       2011-01-15  2011-01-25  <- C

Using T-SQL, I would like to create a new set of data, per user, with eliminated overlapping data, extending ranges and removing redundant data where needed, resulting in something like this:

UserID  StartDate   EndDate 
======  ==========  ==========
1       2011-01-01  2011-02-15 ('A', three rows combined, extending the range)
1       2011-02-20  2011-03-10 ('B', no change, no overlaps here)
2       2011-01-01  2011-01-25 ('C', two rows combined)

Cursors are fine if needed, but if I can do without them that would be even better.

Jakob Möllås
  • 4,239
  • 3
  • 33
  • 61

1 Answers1

15

For SQL Server 2005+

-- sample table with data
declare @t table(UserID int, StartDate datetime, EndDate datetime)
insert @t select
1, '20110101', '20110102' union all select
1, '20110101', '20110110' union all select
1, '20110108', '20110215' union all select
1, '20110220', '20110310' union all select
2, '20110101', '20110120' union all select
2, '20110115', '20110125'

-- your query starts below

select UserID, Min(NewStartDate) StartDate, MAX(enddate) EndDate
from
(
    select *,
        NewStartDate = t.startdate+v.number,
        NewStartDateGroup =
            dateadd(d,
                    1- DENSE_RANK() over (partition by UserID order by t.startdate+v.number),
                    t.startdate+v.number)
    from @t t
    inner join master..spt_values v
      on v.type='P' and v.number <= DATEDIFF(d, startdate, EndDate)
) X
group by UserID, NewStartDateGroup
order by UserID, StartDate

Notes:

  1. Replace @t with your table name
RichardTheKiwi
  • 105,798
  • 26
  • 196
  • 262
  • 1
    Great stuff! Works like a charm! I will have to look into the usage of **DENSE_RANK()**, that was new to me. Thanks! – Jakob Möllås Mar 06 '11 at 21:57
  • 2
    It's worth mentioning that for date spans greater than the number of rows returned by `master..spt_values` this will not work. In this case, you could cross join that table to itself to give a greater window size. – Chris Pickford Jul 22 '16 at 11:28
  • Some DENSE_RANK() docs: https://learn.microsoft.com/en-us/sql/t-sql/functions/dense-rank-transact-sql – Westy92 Jul 19 '17 at 22:12