1

I have a table that looks like this

ID, Order, Segment
1, 1, A
1, 2, B
1, 3, B
1, 4, C
1, 5, B
1, 6, B
1, 7, B
1, 8, B

Basically by ordering the data using the Order column. I would like to understand the number of consecutive B's for each of the ID's. Ideally the output I would like is

ID, Consec
1, 2
1, 4

Because the segment B appears consecutively in row 2 and 3 (2 times), and then again in row 5,6,7,8 (4 times).

I can't think of a solution in SQL since there is no loop facility in SQL.

Are there elegant solutions in Teradata SQL?

P.S. The data I am dealing with has ~20 million rows.

The way to do it in R has been published here.

How to count the number of times an element appears consecutively in a data.table?

Community
  • 1
  • 1
xiaodai
  • 14,889
  • 18
  • 76
  • 140

1 Answers1

4

It is easy to do with analytic functions. While I don't know anything about teradata, quickly googling makes it appear as though it does support analytic functions.

In any case, I've tested the following in Oracle --

  select id,
         count(*)
    from (select x.*,
                 row_number() over(partition by id order by ord) -
                 row_number() over(partition by id, seg order by ord) as grp
            from tbl x) x
   where seg = 'B'
group by id, grp
order by grp

The trick is establishing the 'groups' of Bs.

Fiddle: http://sqlfiddle.com/#!4/4ed6c/2/0

Brian DeMilia
  • 13,103
  • 1
  • 23
  • 33
  • 1
    If `ord` is actually an unbroken unique sequence, you could just do `ord - row_number() over(partition by id, seg order by ord) as grp` Also, I believe you can safely add the `seg` column to the final output by appending it to the list of columns to `group by`. – Paul Griffin Apr 16 '15 at 01:45
  • or just row_number() - row_number() over(partition by id, seg order by ord) as grp – xiaodai Apr 16 '15 at 02:27