0

I have a table with 3 columns c1,c2,c3 in Oracle like below:

c1  c2  c3
 1  34   2
 2  34   2
 3  34   2
 4  24   2
 5  24   2
 6  34   2
 7  34   2
 8  34   1

I need to group the col1 and get the min and max number (of col1) based on its sequence, col2 and col3.

i.e., I need the result as below:

c1_min  c1_max  c2  c3
     1       3  34   2
     4       5  24   2
     6       7  34   2
     8       8  34   1
Alex Poole
  • 183,384
  • 11
  • 179
  • 318
Mars
  • 13
  • 2

2 Answers2

3

There are a number of ways to approach a gaps-and-islands problem. As an alternative to Sylvain's lag version - not better, just different - you can use a trick with row numbers calculated analytically based on your grouping fields. This adds a 'chain' psuedocolumn to the table values, which will be unique for each contiguous group of c2/c3 pairs:

select c1, c2, c3,
  dense_rank() over (partition by c2, c3 order by c1)
    - dense_rank() over (partition by null order by c1) as chain
from t42
order by c1, c2, c3;

(I can't take credit for this - I first saw it here). You can then use that as an inline view to calculate your sum:

select min(c1) as c1_min, max(c1) as c1_max, c2, c3
from (
  select c1, c2, c3,
    dense_rank() over (partition by c2, c3 order by c1)
      - dense_rank() over (partition by null order by c1) as chain
  from t42
)
group by c2, c3, chain
order by c1_min;

    C1_MIN     C1_MAX         C2         C3
---------- ---------- ---------- ----------
         1          3         34          2 
         4          5         24          2 
         6          7         34          2 
         8          8         34          1 

SQL Fiddle showing the intermediate stage too.

You can use other analytic functions like row_number() instead of dense_rank(); they may give slightly different results for some data, but you get the same result with this sample.

Community
  • 1
  • 1
Alex Poole
  • 183,384
  • 11
  • 179
  • 318
  • This answer solved my similar problem for SQL Server 2008 R2, which doesn't support the `LAG` function. – dpw Feb 10 '15 at 00:11
2

If I understand it well, you want to group consecutive rows together. This is far from being trivial. Or at least, I can't find right now a simple way of doing it. For ease of understanding, I will break the query in several steps:

Step 1:

The first thing is to identify your "groups" boundaries. Using the LAG analytic function might help you here:

CASE WHEN LAG("c2", 1) OVER(ORDER BY "c1") = "c2" 
      AND LAG("c3", 1) OVER(ORDER BY "c1") = "c3" 
     THEN 0 
     ELSE 1
END CLK,
T.* FROM T
ORDER BY "c1"

Step 2:

The second step, is to number each of your groups. A simple SUM over partition will do the trick. That leads to:

SELECT SUM(CLK) OVER (ORDER BY "c1"
                      ROWS BETWEEN UNBOUNDED PRECEDING 
                      AND CURRENT ROW) GRP,
       V.* 
FROM (
  SELECT
    CASE WHEN LAG("c2", 1) OVER(ORDER BY "c1") = "c2" 
          AND LAG("c3", 1) OVER(ORDER BY "c1") = "c3" 
         THEN 0 
         ELSE 1
    END CLK,
    T.* FROM T
) V
ORDER BY "c1";

Final step:

Finally, you can wrap that in a simple GROUP BY query to obtain the desired output:

SELECT MIN("c1"), MAX("c1"), "c2", "c3" FROM
(
    SELECT SUM(CLK) OVER (ORDER BY "c1"
                          ROWS BETWEEN UNBOUNDED PRECEDING 
                          AND CURRENT ROW) GRP,
           V.* 
    FROM (
      SELECT
        CASE WHEN LAG("c2", 1) OVER(ORDER BY "c1") = "c2"
              AND LAG("c3", 1) OVER(ORDER BY "c1") = "c3"
             THEN 0 
             ELSE 1
        END CLK,
        T.* FROM T
    ) V
)
GROUP BY GRP, "c2", "c3"
ORDER BY GRP

See http://sqlfiddle.com/#!4/7d57c/10

Sylvain Leroux
  • 50,096
  • 7
  • 103
  • 125
  • i am using toad for running queries, when i run the above query. it gives me an error saying ORA-00904: "c3": invalid identifier' – Mars Nov 10 '14 at 06:49
  • 1
    @Mars - Sylvain has quoted your identifiers, I guess because you showed them as lower-case in the query results. If they are not really quoted then just remove all of the double-quotes from the query. – Alex Poole Nov 10 '14 at 08:57