0

I need to group rows by some columns and by a running sum until it reaches a threshold. The closest I got was with a query based on this answer, but this solution is not as precise as I need it to be, because the sum has to be reset and restarted when it reaches the threshold.

Here's my attempt with some sample data and a threshold of 100:

drop table #table

create table #table (
     Id int not null,
     GroupId int not null,
     Code nvarchar(14) not null,
     Total int not null
)

insert into #table values
( 1, 1, '1111', 20),( 2, 1, '1111', 75),( 3, 1, '1111', 40),( 4, 1, '1111', 20),
( 5, 1, '1111', 20),( 6, 1, '1111', 25),( 7, 1, '2222', 20),( 8, 1, '2222', 20),
( 9, 1, '2222', 20),(10, 1, '2222', 20),(11, 2, '3333', 10),(12, 2, '3333', 90),
(13, 2, '3333', 90),(14, 2, '3333', 90),(15, 2, '3333', 90),(16, 2, '3333', 10),
(17, 2, '3333', 10),(18, 2, '3333', 10),(19, 2, '3333', 10),(20, 2, '3333', 10),
(21, 2, '3333', 10),(22, 2, '3333', 10),(23, 2, '3333', 10),(24, 2, '3333', 10),
(25, 2, '3333', 10),(26, 2, '3333', 10),(27, 2, '3333', 10),(28, 2, '3333', 10),
(29, 2, '3333', 10),(30, 2, '3333', 10),(31, 2, '3333', 10),(32, 2, '3333', 10),
(33, 2, '3333', 10),(34, 2, '3333', 10),(35, 2, '3333', 10)

;with cte as (
     select
         Id, GroupId, Code, Total,
         cast(sum(Total) OVER (ORDER BY Code, id) as int) / 100 AS Limit
     from #table
)
select
     *,
     dense_rank() over(ORDER BY Code, Limit) as Groups
from cte order by id

This is what I get, I manually added a "GroupsExpected" column to show what I actually need the groups to be.

| Id | GroupId | Code | Total | Limit | Groups | GroupsExpected |
|----|---------|------|-------|-------|--------|----------------|
|  1 |       1 | 1111 |    20 |     0 |      1 |              1 |
|  2 |       1 | 1111 |    75 |     0 |      1 |              1 |
|  3 |       1 | 1111 |    40 |     1 |      2 |              2 |
|  4 |       1 | 1111 |    20 |     1 |      2 |              2 |
|  5 |       1 | 1111 |    20 |     1 |      2 |              2 |
|  6 |       1 | 1111 |    25 |     2 |      3 |              3 |
|  7 |       1 | 2222 |    20 |     2 |      4 |              4 |
|  8 |       1 | 2222 |    20 |     2 |      4 |              4 |
|  9 |       1 | 2222 |    20 |     2 |      4 |              4 |
| 10 |       1 | 2222 |    20 |     2 |      4 |              4 |
| 11 |       2 | 3333 |    10 |     2 |      5 |              5 |
| 12 |       2 | 3333 |    90 |     3 |      6 |              6 |
| 13 |       2 | 3333 |    90 |     4 |      7 |              7 |
| 14 |       2 | 3333 |    90 |     5 |      8 |              8 |
| 15 |       2 | 3333 |    90 |     6 |      9 |              9 |
| 16 |       2 | 3333 |    10 |     6 |      9 |             10 |
| 17 |       2 | 3333 |    10 |     6 |      9 |             10 |
| 18 |       2 | 3333 |    10 |     6 |      9 |             10 |
| 19 |       2 | 3333 |    10 |     6 |      9 |             10 |
| 20 |       2 | 3333 |    10 |     7 |     10 |             10 |
| 21 |       2 | 3333 |    10 |     7 |     10 |             10 |
| 22 |       2 | 3333 |    10 |     7 |     10 |             10 |
| 23 |       2 | 3333 |    10 |     7 |     10 |             10 |
| 24 |       2 | 3333 |    10 |     7 |     10 |             10 |
| 25 |       2 | 3333 |    10 |     7 |     10 |             11 |
| 26 |       2 | 3333 |    10 |     7 |     10 |             11 |
| 27 |       2 | 3333 |    10 |     7 |     10 |             11 |
| 28 |       2 | 3333 |    10 |     7 |     10 |             11 |
| 29 |       2 | 3333 |    10 |     7 |     10 |             11 |
| 30 |       2 | 3333 |    10 |     8 |     11 |             11 |
| 31 |       2 | 3333 |    10 |     8 |     11 |             11 |
| 32 |       2 | 3333 |    10 |     8 |     11 |             11 |
| 33 |       2 | 3333 |    10 |     8 |     11 |             11 |
| 34 |       2 | 3333 |    10 |     8 |     11 |             12 |
| 35 |       2 | 3333 |    10 |     8 |     11 |             12 |

The sum of the Total column for each group can't hit 100, and groups '9' and '10' hit that amount (their sum is 130 and 100 respectively).

I also tried to use a recursive CTE from this answer, but in that case I couldn't group by the Code column.

I could do this programmatically, but I have a feeling this can be achieved more easily with a single query.

I'm using MSSQL 2016.

Marcos Dimitrio
  • 6,651
  • 5
  • 38
  • 62
  • Maybe recursive cte? I'm not sure. I'm tending to think cursor (yuk). – avery_larry Oct 31 '19 at 16:45
  • You say "threshold of 100" and "can't exceed 100". You point out that group 10 is not acceptable because it equals 100 -- but equalling 100 is NOT exceeding 100. Since you indicate in your expected results that the total of 100 is not acceptable, I have taken that to mean "the running total must be less than 100", meaning exactly 100 is not acceptable. – avery_larry Oct 31 '19 at 17:27
  • You also haven't said anything about what to do if an individual total is 100 or more. – avery_larry Oct 31 '19 at 17:29

3 Answers3

1

Here's a recursive CTE. It requires the ID to be incremental, with no gaps, and in the order you want. This is true in your sample data. However, if it's not guaranteed to be like that in your actual data, then you'd have to use a subquery with row_number to get a sequential number in the order you want.

declare @table table (
     Id int not null,
     GroupId int not null,
     Code nvarchar(14) not null,
     Total int not null
)

insert into @table values
( 1, 1, '1111', 20),( 2, 1, '1111', 75),( 3, 1, '1111', 40),( 4, 1, '1111', 20),
( 5, 1, '1111', 20),( 6, 1, '1111', 25),( 7, 1, '2222', 20),( 8, 1, '2222', 20),
( 9, 1, '2222', 20),(10, 1, '2222', 20),(11, 2, '3333', 10),(12, 2, '3333', 90),
(13, 2, '3333', 90),(14, 2, '3333', 90),(15, 2, '3333', 90),(16, 2, '3333', 10),
(17, 2, '3333', 10),(18, 2, '3333', 10),(19, 2, '3333', 10),(20, 2, '3333', 10),
(21, 2, '3333', 10),(22, 2, '3333', 10),(23, 2, '3333', 10),(24, 2, '3333', 10),
(25, 2, '3333', 10),(26, 2, '3333', 10),(27, 2, '3333', 10),(28, 2, '3333', 10),
(29, 2, '3333', 10),(30, 2, '3333', 10),(31, 2, '3333', 10),(32, 2, '3333', 10),
(33, 2, '3333', 10),(34, 2, '3333', 10),(35, 2, '3333', 10)

;with rcte as (
    select id, groupid, code, total, total as runtotal, 1 as groups
    from @table
    where id=1

    union all

    select t.id, t.groupid, t.code, t.total,
       case when r.runtotal + t.total >= 100 or r.code <> t.code
            then t.total
            else r.runtotal + t.total
       end as runtotal,
       case when r.runtotal + t.total >= 100 or r.code <> t.code
            then groups + 1
            else groups
       end as groups
    from rcte r
    inner join @table t on t.id = r.id + 1
)
select id, groupid, code, total, groups
from rcte
order by id
avery_larry
  • 2,069
  • 1
  • 5
  • 17
1

You can do this using a recursive CTE:

with tt as (
      select t.*, row_number() over (order by id) as seqnum
      from t
     ),
     cte as (
      select id, groupid, code, total, total as totaltotal, 1 as grp, tt.seqnum
      from tt
      where seqnum = 1
      union all
      select tt.id, tt.groupid, tt.code, tt.total,
             (case when cte.totaltotal + tt.total > 100 or cte.groupid <> tt.groupid or cte.code <> tt.code
                   then tt.total else totaltotal + tt.total
              end),
             (case when cte.totaltotal + tt.total > 100 or cte.groupid <> tt.groupid or cte.code <> tt.code
                   then cte.grp + 1 else cte.grp
              end),
             tt.seqnum
      from cte join
           tt
           on tt.seqnum = cte.seqnum + 1
     )
select *
from cte
order by id;

Here is a db<>fiddle.

Unfortunately, there is not a "set-based" approach to this.

You could speed it up if you accepted restarting the numbers for each groupid/code combination.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
0

Here it is with a cursor.

declare @table table (
     Id int not null,
     GroupId int not null,
     Code nvarchar(14) not null,
     Total int not null
)

insert into @table values
( 1, 1, '1111', 20),( 2, 1, '1111', 75),( 3, 1, '1111', 40),( 4, 1, '1111', 20),
( 5, 1, '1111', 20),( 6, 1, '1111', 25),( 7, 1, '2222', 20),( 8, 1, '2222', 20),
( 9, 1, '2222', 20),(10, 1, '2222', 20),(11, 2, '3333', 10),(12, 2, '3333', 90),
(13, 2, '3333', 90),(14, 2, '3333', 90),(15, 2, '3333', 90),(16, 2, '3333', 10),
(17, 2, '3333', 10),(18, 2, '3333', 10),(19, 2, '3333', 10),(20, 2, '3333', 10),
(21, 2, '3333', 10),(22, 2, '3333', 10),(23, 2, '3333', 10),(24, 2, '3333', 10),
(25, 2, '3333', 10),(26, 2, '3333', 10),(27, 2, '3333', 10),(28, 2, '3333', 10),
(29, 2, '3333', 10),(30, 2, '3333', 10),(31, 2, '3333', 10),(32, 2, '3333', 10),
(33, 2, '3333', 10),(34, 2, '3333', 10),(35, 2, '3333', 10)

select *
from @table
order by code,id


declare @runtotal int = 0
declare @groups int = 0
declare @code nvarchar(14)
declare @currentcode nvarchar(14) = ''
declare @total int
declare @id int

declare @output table (
     Id int not null,
     Groups int not null
)

declare cursor_table cursor
    for select id, code, total
        from @table
        order by code,id

open cursor_table
fetch next from cursor_table into @id, @code, @total

while @@fetch_status = 0
    begin
        set @runtotal += @total
        if @runtotal >= 100 or @code <> @currentcode
            begin
                set @runtotal = @total
                set @groups += 1
                set @currentcode = @code
            end
        insert into @output
        select @id,@groups
        fetch next from cursor_table into @id, @code, @total
    end

select t.*,groups
from @table t
inner join @output o on o.id=t.id

close cursor_table
deallocate cursor_table
avery_larry
  • 2,069
  • 1
  • 5
  • 17