I have come across a strange behavior with the gaps and island solution. With 3 columns (3rd column being non integer), the result is random really. Let's suppose we the following query:
Declare @Table1 TABLE
(
ID varchar(50),
yr float,
CO1 varchar(50)
);
INSERT INTO @Table1 (ID, yr, CO1)
VALUES ('I2','2011','ABE'), ('I2','2012','ABE'), ('I2','2013','ABE'),
('I2','2014','ABE'), ('I2','2014','ABE'), ('I2','2005','ABD'),
('I2','2006','ABD'), ('I2','2007','ABD'), ('I2','2008','ABD'),
('I2','2007','ABA CD'), ('I2','2011','ABA CD'), ('I2','2013','ABA CD');
SELECT
ID, CO1, StartSeqNo = MIN(yr), EndSeqNo = MAX(yr)
FROM
(SELECT
ID, yr, CO1,
rn = yr - ROW_NUMBER() OVER (PARTITION BY ID ORDER BY yr)
FROM
@Table1) a
GROUP BY
ID, CO1, rn ;
The result I am aiming for is :
ID CO1 StartSeqNo EndSeqNo
----------------------------
I2 ABA CD 2007 2007
I2 ABA CD 2011 2011
I2 ABA CD 2013 2013
I2 ABD 2005 2008
I2 ABE 2011 2014
I have looked through stackoverflow and elsewhere to determine if I was missing something. I already tried with distinct and dense_rank, neither gives the proper result
Here are the distinct and dense_rank queries I've already tried:
--- distinct
SELECT distinct ID,CO1, StartSeqNo=MIN(yr), EndSeqNo=MAX(yr)
FROM (
SELECT distinct ID, yr, CO1
,rn=yr-ROW_NUMBER() OVER (PARTITION BY ID ORDER BY yr)
FROM @Table1) a
GROUP BY ID, CO1, rn ;
--- with dense_rank
SELECT ID,CO1, StartSeqNo=MIN(yr), EndSeqNo=MAX(yr)
FROM (
SELECT ID, yr, CO1
,rn=yr-dense_rank() OVER (PARTITION BY ID ORDER BY yr)
FROM @Table1) a
GROUP BY ID, CO1, rn ;
I dont see why the gaps and island query would not work with having a non-integer column. I reckon there is an issue with grouping somewhere. Please help me with this.
Sim