MSSQL Group until type is changed

Question

I currently have this table:

╔════╦══════════╦════════════╗
║ ID ║ PartType ║ PartStatus ║
╠════╬══════════╬════════════╣
║  1 ║ A        ║ OK         ║
║  2 ║ A        ║ BAD        ║
║  3 ║ A        ║ OK         ║
║  4 ║ A        ║ OK         ║
║  5 ║ B        ║ OK         ║
║  6 ║ B        ║ BAD        ║
║  7 ║ A        ║ OK         ║
╚════╩══════════╩════════════╝

I want to be able to group them by PartType UNTIL it changes. So it should output like this:

╔══════════╦══════════╗
║ PartType ║ Quantity ║
╠══════════╬══════════╣
║ A        ║        4 ║
║ B        ║        2 ║
║ A        ║        1 ║
╚══════════╩══════════╝

What version of SQL Server are you using and what have you tried before asking the question? — Radu Gheorghiu, May 03 '16 at 11:53
SQL is set-based, it has no notion of "until" unless you introduce an order. By what do you want to order -- ID? I'd almost want to ask "why", since it seems like you're organizing the data in a way that's not meaningful. — Jeroen Mostert, May 03 '16 at 11:57
I want to order by ID, initially. The why is because we have a production line and we need to know how many pieces he produced before changing, to cross check with our planning, and verify they have reached their goal before changing. — Felipe Deguchi, May 03 '16 at 12:09
Alright, that means the ID should really be a timestamp, but without an explicit field for that you could indeed use the ID as an approximation (if rows were inserted sequentially). — Jeroen Mostert, May 03 '16 at 12:10
just few days back i solve similar problem in my own way.........http://stackoverflow.com/questions/36927685/count-number-of-consecutive-occurance-of-values-in-table/36930349#36930349 — KumarHarsh, May 03 '16 at 12:26

score 3 · Answer 1 · answered May 03 '16 at 12:30

You can also do this kind of groupings using row_number, which should work a lot better with bigger data set since you don't have to do any joins. This should also return the expected result:

select PartType, count(*)
from (
  select *, 
    row_number() over (order by ID) as RN1, 
    row_number() over (partition by PartType order by ID) as RN2
  from yourtable
) X
group by PartType, RN1 - RN2
order by min(ID)

The trick here is that the first row number numbers all the rows, and the second partitions them by PartType. So when the difference between RN1 and RN2 changes, it's a different type.

The clever thing is not that it detects the partition switches, but that `GROUP BY RN1 - RN2` actually works no matter how the partitions are arranged. — Jeroen Mostert, May 03 '16 at 13:17

score 3 · Accepted Answer · answered May 03 '16 at 13:33

If you are using SQL Server 2012 or higher then another approach worth mentioning is to make use of the windowing functions available in 2012.

You can use the LAG function to detect when a state change has occurred within your dataset and you can use the SUM OVER clause to generate a grouping id for your data. The following example demonstrates how this can be done.

    DECLARE @parts TABLE
    (
        ID int IDENTITY(1,1) NOT NULL PRIMARY KEY,
        PartType nvarchar(1) NOT NULL,
        PartStatus nvarchar(50) NOT NULL
    )

    INSERT INTO @parts (PartType,PartStatus)
    VALUES 
    (N'A',N'OK'),
    (N'A',N'BAD'),
    (N'A',N'OK'),
    (N'A',N'OK'),
    (N'B',N'OK'),
    (N'B',N'BAD'),
    (N'A',N'OK');


    WITH CTE_PartTypeWithStateChange
    AS
    (
        SELECT   ID
                ,PartType
                ,PartStatus
                ,(
                    CASE
                        WHEN (LAG(PartType, 1, '') OVER (ORDER BY ID) <> PartType) THEN  1
                        ELSE 0
                    END
                    ) HasStateChanged
        FROM  @parts 
    )
    ,
    CTE_PartTypeWithGroupID
    AS
    (
        SELECT   ID
                ,PartType
                ,PartStatus
                ,SUM(HasStateChanged) OVER (ORDER BY ID ROWS UNBOUNDED PRECEDING) AS GroupID
        FROM    CTE_PartTypeWithStateChange
    )
    SELECT   MAX(PartType) AS PartType
            ,COUNT(PartType) AS Quantity
    FROM     CTE_PartTypeWithGroupID
    GROUP BY GroupID

While it is a bit more code this approach does give you the benefit of reducing the number of reads on your source table since you are not performing any self joins. This approach also reduces the number of sorts that the query has to perform which should improve performance on larger data sets.

Thanks a lot for this. Since my table have lots of records, this saves me some time on each request. — Felipe Deguchi, May 03 '16 at 15:17

Raul · Answer 3 · 2016-05-03T12:14:14.127

1

Consider this test Table using your sample input:

DECLARE @test TABLE
(
    ID int IDENTITY(1,1) NOT NULL,
    PartType nvarchar(1) NOT NULL,
    PartStatus nvarchar(50) NOT NULL
)

INSERT INTO @test (PartType,PartStatus)
VALUES 
(N'A',N'OK'),
(N'A',N'BAD'),
(N'A',N'OK'),
(N'A',N'OK'),
(N'B',N'OK'),
(N'B',N'BAD'),
(N'A',N'OK');

I used an apply to get the next ID when the PartType changes:

SELECT t.PartType
, COUNT(t.ID) AS Quantity
FROM @test t
INNER JOIN (
    SELECT MAX(ID) + 1 axID
    FROM @test
) m 
ON 1 = 1
OUTER APPLY (
    SELECT TOP 1 s.ID as extID
    FROM @test s
    WHERE s.ID > t.ID
    AND s.PartType <> t.PartType
    ORDER BY s.ID ASC
) n 
GROUP BY t.PartType, ISNULL(n.extID,m.axID)
ORDER BY ISNULL(n.extID,m.axID)

edited May 03 '16 at 12:14

answered May 03 '16 at 11:56

Raul

2,745
1
23
39

Yeah, I misunderstood the question, my bad. I just figured that he wants an order for his groups. I'll edit the answer, the current one does not meet the requirements. – Raul May 03 '16 at 12:01
Works perfectly. Thanks a lot. – Felipe Deguchi May 03 '16 at 12:16

score 0 · Answer 4 · answered May 03 '16 at 12:32

Try this simple script using recursive CTE,

WITH cte_test as(
    select *,1 as recno from @Table1 where id=1
    union all
    select t.*,(case when c.PartType = t.PartType then recno else recno+1 end ) 
    from @Table1 t inner join cte_test c on t.ID =  c.ID+1
)
select PartType,count(*) from cte_test 
group by recno,PartType
order by recno
option (maxrecursion 0)

MSSQL Group until type is changed

4 Answers4