3

This is a real brainteaser.

I want an elegant solution which simplifies overlapping time periods, the catch is that whilst some periods overlap, they do not necessarily form a continuous history between them. Take the below data for a Customer with multiple agreements, I want to identify a date history identifying the periods where the Customer had an active agreement (99999999 is a key used to identify something still active):

if object_id('tempdb..#t1') is not null
    drop table #t1
create table #t1 (CustomerKey int , ContractStartDateKey int, ContractEndDateKey int)

insert into #t1 (CustomerKey,ContractStartDateKey,ContractEndDateKey)

select 34,  20140103,   20150303
union
select 34,  20141121,   20150302
union
select 34,  20150430,   20161010
union
select 34,  20150901,   20161010
union
select 34,  20151113,   20161010
union
select 34,  20160713,   99999999
union
select 34,  20180202,   99999999
union
select 1,   20170120,   20170819
union
select 2,   20160105,   99999999
union
select 56,  20130406,   20140506
union
select 56,  20130806,   20141106

Gives results:

CustomerKey ContractStartDateKey    ContractEndDateKey
1           20170120                20170819
2           20160105                99999999
34          20140103                20150303
34          20141121                20150302
34          20150430                20161010
34          20150901                20161010
34          20151113                20161010
34          20160713                99999999
34          20180202                99999999
56          20130406                20140506
56          20130806                20141106

Desired outcome:

CustomerKey ContractStartDateKey    ContractEndDateKey
1           20170120                20170819
2           20160105                99999999
34          20140103                20150303
34          20150430                99999999
56          20130406                20141106

Any tips from anyone who had handled this sort thing before would be great. Size of data is small enough to permit loops if needed.

Tab Alleman
  • 31,483
  • 7
  • 36
  • 52
TJB
  • 787
  • 1
  • 8
  • 29
  • You need a window functions with cte mixing to make such an aggregation, row_number() over (Partition by customerKey) and then join cte on itself – rgen3 Oct 04 '18 at 17:17
  • @TJB consider [referencing this post](https://stackoverflow.com/questions/12365992/sql-insert-multiple-rows) for inserting multiple rows in one insert statement. – PausePause Oct 04 '18 at 17:18
  • What version of SQL? This looks like a Gaps and Islands problem. – Shawn Oct 04 '18 at 17:59
  • This is a `gaps and islands problem`. Google that term and you will find lots of examples and solutions. – Tab Alleman Oct 04 '18 at 18:13
  • Thanks for the pointers chaps, but the recursive CTE method is super inefficient with the real-life date ranges and customer numbers I'm dealing with, if I find a solution I will report back. – TJB Oct 04 '18 at 19:42

0 Answers0