SQL Query for Grouping the results based on sequence

Question

I have a table like this:

ID  Seq  Amt
1   1    500
1   2    500
1   3    500
1   5    500
2   10   600
2   11   600
3   1    700
3   3    700

I want to group the continuous sequence numbers into a single row like this:

ID  Start  End  TotalAmt
1   1      3    1500
1   5      5    500
2   10     11   1200
3   1      1    700
3   3      3    700

Please help to achieve this result.

which version of sql server? is (id+seq) unique? – RichardTheKiwi Feb 23 '11 at 08:45 — RichardTheKiwi, Feb 23 '11 at 08:45

Andriy M · Accepted Answer · 2013-08-30T10:21:24.220

22

WITH numbered AS (
  SELECT
    ID, Seq, Amt,
    SeqGroup = ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Seq) - Seq
  FROM atable
)
SELECT
  ID,
  Start = MIN(Seq),
  [End] = MAX(Seq),
  TotalAmt = SUM(Amt)
FROM numbered
GROUP BY ID, SeqGroup
ORDER BY ID, Start
;

edited Aug 30 '13 at 10:21

answered Feb 23 '11 at 08:22

Andriy M

76,112
17
94
154

2

+1 Took me a while to figure out the `- Seq` trick and verify (for myself) there can't be any collisions but OP can't ask for any better than this. – Lieven Keersmaekers Feb 23 '11 at 08:39
+1 I had a suspicion that ROW_NUMBER (or RANK,...) would help but abandoned it for my (more complicated and probably slower) solution. – Sem Vanmeenen Feb 23 '11 at 08:50
@Lieven: Please believe me when I'm saying that with a praise like this one doesn't need any formal upvoting. :) Thanks, you are very kind! By the way, you shouldn't have deleted your answer. One more recursive CTE demonstration is never too many, in my opinion, as that is a rather tricky technique, which is best learnt by examples. Also I didn't notice any Martin's or Quassnoi's presence here. :) – Andriy M Feb 23 '11 at 08:51
@Andriy, I've deleted mine to avoid future embarassement :). *...I didn't notice...* I'll just have to add you to my list for future reference. – Lieven Keersmaekers Feb 23 '11 at 08:56
@Andriy, @Lieven - undoubtedly a good solution. Would be funny for OP to come back with sql2000. I posted another option – RichardTheKiwi Feb 23 '11 at 09:22
@Lieven: What an honour, thanks for the warning. :) I'll just have to remember to upvote your reasonable answers *before* posting mine if any (just in case), in the hope you wouldn't dare delete them as easily. – Andriy M Feb 23 '11 at 09:31
@Andriy: I already have table with data. Plealse modify the same query to select from the existing table instead of using UNION ALL. Please help – Nagesh Feb 23 '11 at 11:20
@Nagesh: There. Removed the sample data CTE, but you still have to replace `atable` with the actual table name. By the way, a correct answer doesn't have to fit your actual environment *literally*. As long as it is basically a solution to your problem, you are free to accept it as *the* solution. – Andriy M Feb 23 '11 at 12:12
1

That is an elegant and beautiful solution. Excellent. – Matthew Burr Feb 24 '11 at 02:40
I need to select ranges based on some condition like `((MAX(Seq)-MIN(Seq)+1)/10)>0` Is it possible? – Santhosh Nayak Nov 17 '16 at 06:22
@SanthoshNayak: That can make a good question. Feel free to submit one, I'm sure there'll be some suggestions. – Andriy M Nov 17 '16 at 06:49
@AndriyM Got solution. made second query as inner query. and used `PARTITION BY` instead of `GROUP BY` and put my condition for outer query. – Santhosh Nayak Nov 17 '16 at 07:10
1

@SanthoshNayak: Came to the same solution in my mind myself. You could still post *and* [self-answer](http://meta.stackexchange.com/questions/17463/can-i-answer-my-own-questions-even-if-i-knew-the-answer-before-asking) that question. Self-answering is encouraged here, you know. (Well, it certainly is if the question and the answer are laid out well, anyway.) It's no pressure, just keep that option in mind. – Andriy M Nov 17 '16 at 07:57

Sem Vanmeenen · Answer 2 · 2011-02-23T07:56:54.797

This seems to work nicely. @breakingRows will contain all rows that break the sequence of id and seq (i.e. if id changes or if seq is not 1 more than the previous seq). With that table you can select all rows of such a sequence within @temp. I must add however that performance will probably be not all that good because of all the subqueries but you'll need to test to be sure.

declare @temp table (id int, seq int, amt int)
insert into @temp select 1, 1, 500
insert into @temp select 1, 2, 500
insert into @temp select 1, 3, 500
insert into @temp select 1, 5, 500
insert into @temp select 2, 10, 600
insert into @temp select 2, 11, 600
insert into @temp select 3, 1, 700
insert into @temp select 3, 3, 700

declare @breakingRows table (ctr int identity(1,1), id int, seq int)

insert into @breakingRows(id, seq)
select id, seq
from @temp t1 
where not exists 
    (select 1 from @temp t2 where t1.id = t2.id and t1.seq - 1 = t2.seq)
order by id, seq

select br.id, br.seq as start, 
       isnull ((select top 1 seq from @temp t2 
               where id < (select id from @breakingRows br2 where br.ctr = br2.ctr - 1) or 
                     (id = (select id from @breakingRows br2 where br.ctr = br2.ctr - 1) and
                      seq < (select seq from @breakingRows br2 where br.ctr = br2.ctr - 1))          
               order by id desc, seq desc),
               br.seq)
      as [end],
      (select SUM(amt) from @temp t1 where t1.id = br.id and 
        t1.seq < 
            isnull((select seq from @breakingRows br2 where br.ctr = br2.ctr - 1 and br.id = br2.id), 
                   (select max(seq) + 1 from @temp)) and 
        t1.seq >= br.seq)
from @breakingRows br
order by id, seq

Thanks for the effort. The solution is perfectly working! But I can mark only one answer as Best answer. — Nagesh, Feb 23 '11 at 10:43

score 1 · Answer 3 · answered Feb 23 '11 at 07:48

Well, there's perhaps a more elegant way to do this (something hints at me that there is), but here's an approach that will work if you're using a version of SQL Server that accepts common table expressions:

use Tempdb
go

create table [Test]
(
    [id] int not null,
    [Seq] int not null,
    [Amt] int not null
)

insert into [Test] values
(1, 1, 500),
(1, 2, 500),
(1, 3, 500),
(1, 5, 500),
(2, 10, 600),
(2, 11, 600),
(3, 1, 700),
(3, 3, 700)

;with
lower_bound as (
    select *
      from Test
     where not exists (
        select *
          from Test as t1
         where t1.id = Test.id and t1.Seq = Test.Seq - 1
    )
),
upper_bound as (
    select *
      from Test
     where not exists (
        select *
          from Test as t1
         where t1.id = Test.id and t1.Seq = Test.Seq + 1
    )
),
bounds as (
    select id, (select MAX(seq) from lower_bound where lower_bound.id = upper_bound.id and lower_bound.Seq <= upper_bound.Seq) as LBound, Seq as Ubound
      from upper_bound
)
select Test.id, LBound As [Start], UBound As [End], SUM(Amt) As TotalAmt
  from Test
  join bounds
    on Test.id = bounds.id
   and Test.Seq between bounds.LBound and bounds.Ubound
 group by Test.id, LBound, UBound

drop table [Test]

Thank you very much for the effort. Your solution is working fine. — Nagesh, Feb 23 '11 at 11:01

score 1 · Answer 4 · answered Feb 23 '11 at 09:20

1

Since Andriy has already posted the gold solution, here's my take using an UPDATE statement to get the result from a temp table, just for fun.

declare @tmp table (
    id int, seq int, amt money, start int, this int, total money,
    primary key clustered(id, seq))
;
insert @tmp
select *, start=seq, this=seq, total=convert(money,amt)
from btable
;
declare @id int, @seq int, @start int, @amt money
update @tmp
set 
    @amt = total = case when id = @id and seq = @seq+1 then @amt+total else amt end,
    @start = start = case when id = @id and seq = @seq+1 then @start else seq end,
    @seq = this = seq,
    @id = id = id
from @tmp
option (maxdop 1)
;
select id, start, max(this) [end], max(total) total
from @tmp
group by id, start
order by id, start

Notes:

btable: your table name
id int, seq int, amt money: expected columns in your table

answered Feb 23 '11 at 09:20

RichardTheKiwi

105,798
26
196
262

It works, but that's some new SQL Server stuff to me. You've managed to post something valueable just by having fun, thanks. :) – Andriy M Feb 23 '11 at 09:39
Is there a specific reason for the ´option (maxdop 1)´ ? EDIT : I think I just figured it out. You update ´@tmp´ with data from ´@tmp´ itself. Multithreading would interfere with that, correct ? – Sem Vanmeenen Feb 23 '11 at 10:15
@Sem Yes, because this is a brittle query. There is no way to use ORDER BY in an UPDATE query, so AFAIK this should work - but no guarantees - if we cluster in the order required, and force maxdop 1. **I stress again** (and voted too), Andriy's answer is the gold one. – RichardTheKiwi Feb 23 '11 at 10:18

score 0 · Answer 5 · answered Feb 23 '11 at 06:59

0

Try following query.

select id, min(seq), max(seq), sum(amt) from table group by id

OOps, sorry, it is wrong query as you need sequence

answered Feb 23 '11 at 06:59

Danil

1,883
1
21
22

score 0 · Answer 6 · answered Jun 28 '18 at 17:54

0

SELECT Id, MIN(Seq) as Start, MAX(Seq) as End, SUM(Amount) as Total
FROM ( 
        SELECT t.*, Seq - ROW_NUMBER() OVER (PARTITION BY Id ORDER BY Seq) Rn
        FROM [Table] t
    ) as T
GROUP BY Id, Rn
ORDER BY Id, MIN(Seq)

answered Jun 28 '18 at 17:54

Dmitry Stepanov

2,776
8
29
45

SQL Query for Grouping the results based on sequence

6 Answers6

Linked