how to group data based on its sequence and group by other columns

Question

I have a table with 3 columns c1,c2,c3 in Oracle like below:

I need to group the col1 and get the min and max number (of col1) based on its sequence, col2 and col3.

i.e., I need the result as below:

c1_min  c1_max  c2  c3
     1       3  34   2
     4       5  24   2
     6       7  34   2
     8       8  34   1

score 3 · Answer 1 · edited May 23 '17 at 11:49

There are a number of ways to approach a gaps-and-islands problem. As an alternative to Sylvain's lag version - not better, just different - you can use a trick with row numbers calculated analytically based on your grouping fields. This adds a 'chain' psuedocolumn to the table values, which will be unique for each contiguous group of c2/c3 pairs:

select c1, c2, c3,
  dense_rank() over (partition by c2, c3 order by c1)
    - dense_rank() over (partition by null order by c1) as chain
from t42
order by c1, c2, c3;

(I can't take credit for this - I first saw it here). You can then use that as an inline view to calculate your sum:

select min(c1) as c1_min, max(c1) as c1_max, c2, c3
from (
  select c1, c2, c3,
    dense_rank() over (partition by c2, c3 order by c1)
      - dense_rank() over (partition by null order by c1) as chain
  from t42
)
group by c2, c3, chain
order by c1_min;

    C1_MIN     C1_MAX         C2         C3
---------- ---------- ---------- ----------
         1          3         34          2 
         4          5         24          2 
         6          7         34          2 
         8          8         34          1

SQL Fiddle showing the intermediate stage too.

You can use other analytic functions like row_number() instead of dense_rank(); they may give slightly different results for some data, but you get the same result with this sample.

This answer solved my similar problem for SQL Server 2008 R2, which doesn't support the `LAG` function. — dpw, Feb 10 '15 at 00:11

Sylvain Leroux · Accepted Answer · 2014-11-08T18:50:24.980

If I understand it well, you want to group consecutive rows together. This is far from being trivial. Or at least, I can't find right now a simple way of doing it. For ease of understanding, I will break the query in several steps:

Step 1:

The first thing is to identify your "groups" boundaries. Using the LAG analytic function might help you here:

CASE WHEN LAG("c2", 1) OVER(ORDER BY "c1") = "c2" 
      AND LAG("c3", 1) OVER(ORDER BY "c1") = "c3" 
     THEN 0 
     ELSE 1
END CLK,
T.* FROM T
ORDER BY "c1"

Step 2:

The second step, is to number each of your groups. A simple SUM over partition will do the trick. That leads to:

SELECT SUM(CLK) OVER (ORDER BY "c1"
                      ROWS BETWEEN UNBOUNDED PRECEDING 
                      AND CURRENT ROW) GRP,
       V.* 
FROM (
  SELECT
    CASE WHEN LAG("c2", 1) OVER(ORDER BY "c1") = "c2" 
          AND LAG("c3", 1) OVER(ORDER BY "c1") = "c3" 
         THEN 0 
         ELSE 1
    END CLK,
    T.* FROM T
) V
ORDER BY "c1";

Final step:

Finally, you can wrap that in a simple GROUP BY query to obtain the desired output:

SELECT MIN("c1"), MAX("c1"), "c2", "c3" FROM
(
    SELECT SUM(CLK) OVER (ORDER BY "c1"
                          ROWS BETWEEN UNBOUNDED PRECEDING 
                          AND CURRENT ROW) GRP,
           V.* 
    FROM (
      SELECT
        CASE WHEN LAG("c2", 1) OVER(ORDER BY "c1") = "c2"
              AND LAG("c3", 1) OVER(ORDER BY "c1") = "c3"
             THEN 0 
             ELSE 1
        END CLK,
        T.* FROM T
    ) V
)
GROUP BY GRP, "c2", "c3"
ORDER BY GRP

See http://sqlfiddle.com/#!4/7d57c/10

i am using toad for running queries, when i run the above query. it gives me an error saying ORA-00904: "c3": invalid identifier' — Mars, Nov 10 '14 at 06:49
@Mars - Sylvain has quoted your identifiers, I guess because you showed them as lower-case in the query results. If they are not really quoted then just remove all of the double-quotes from the query. — Alex Poole, Nov 10 '14 at 08:57

how to group data based on its sequence and group by other columns

2 Answers2

Step 1:

Step 2:

Final step: