How to group continuous ranges using MySQL

Question

I have a table that contains categories, dates and rates. Each category can have different rates for different dates, one category can have only one rate at a given date.

Id        CatId    Date        Rate 
------  ------   ------------   ---------
000001      12   2009-07-07     1
000002      12   2009-07-08     1
000003      12   2009-07-09     1
000004      12   2009-07-10     2
000005      12   2009-07-15     1
000006      12   2009-07-16     1
000007      13   2009-07-08     1
000008      13   2009-07-09     1
000009      14   2009-07-07     2
000010      14   2009-07-08     1
000010      14   2009-07-10     1

Unique index (catid, Date, Rate) I would like for each category to group all continuous dates ranges and keep only the begin and the end of the range. For the previous example, we would have:

CatId    Begin          End            Rate 
------   ------------   ------------   ---------
12        2009-07-07    2009-07-09     1
12        2009-07-10    2009-07-10     2
12        2009-07-15    2009-07-16     1  
13        2009-07-08    2009-07-09     1  
14        2009-07-07    2009-07-07     2
14        2009-07-08    2009-07-08     1
14        2009-07-10    2009-07-10     1

I found a similar solution in the forum which did not exactly give the result

WITH    q AS
        (
        SELECT  *,
                ROW_NUMBER() OVER (PARTITION BY CatId, Rate ORDER BY [Date]) AS rnd,
                ROW_NUMBER() OVER (PARTITION BY CatId ORDER BY [Date]) AS rn
        FROM    my_table
        )
SELECT  CatId AS catidd, MIN([Date]) as beginn, MAX([Date])as endd, Rate
FROM    q
GROUP BY  CatId, rnd - rn, Rate

SEE SQL FIDDLE How can I do the same thing in mysql? Please help!

Why does your example show for `(CatId,Rate)=(14,1)` a resulting range from `2009-07-08` to `2009-07-10` when there is no `2009-07-09` in the underlying table? c.f. `(CatId,Rate)=(12,1)`, which produces two resulting ranges due to its discontinuity. — eggyal, Sep 05 '12 at 07:37

score 6 · Accepted Answer · answered Sep 05 '12 at 07:46

6

MySQL doesn't support analytic functions, but you can emulate such behaviour with user-defined variables:

SELECT   CatID, Begin, MAX(Date) AS End, Rate
FROM (
  SELECT   my_table.*,
           @f:=CONVERT(
             IF(@c<=>CatId AND @r<=>Rate AND DATEDIFF(Date, @d)=1, @f, Date), DATE
           ) AS Begin,
           @c:=CatId, @d:=Date, @r:=Rate
  FROM     my_table JOIN (SELECT @c:=NULL) AS init
  ORDER BY CatId, Rate, Date
) AS t
GROUP BY CatID, Begin, Rate

See it on sqlfiddle.

answered Sep 05 '12 at 07:46

eggyal

122,705
18
212
237

1

@vanabel: It is MySQL's [NULL-safe equal to operator](http://dev.mysql.com/doc/en/comparison-operators.html#operator_equal-to). – eggyal Jan 23 '15 at 12:30
@eggyal Nice answer, upvote. Can you please explain in more detail the trick you used to set the value of variable `@f` (joining with `(SELECT @c:=NULL)`)? – Miljen Mikic Dec 21 '17 at 13:46
@MiljenMikic: in this case, the `IF()` will always take the false branch on the first record (due to `CatID` being non-null and therefore different from the `NULL`-initialised value of `@c`). Consequently, the initial value of `@f` is ignored/overwritten before being read—hence initialisation is unnecessary. – eggyal Dec 21 '17 at 14:18
This SQL's **first running in a session** returns all 11 rows, and returns 7 rows after that. This happens on both MySQL 5.6 and 5.7. The above sqlfiddle always returns 7 rows as expected. Anyone know why? – bopomofu Jan 10 '18 at 03:08

score 4 · Answer 2 · edited Jan 07 '13 at 16:16

4

SELECT catid,min(ddate),max(ddate),rate
FROM (
    SELECT
        Catid,
        Ddate,  
        rate,
        @rn := CASE WHEN (@prev <> rate 
           or DATEDIFF(ddate, @prev_date)>1) THEN @rn+1 ELSE @rn END AS rn,
        @prev := rate,
        @prev_id := catid ,
        @prev_date :=ddate
    FROM (
        SELECT CatID,Ddate,rate 
        FROM rankdate
        ORDER BY CatID, Ddate ) AS a , 
        (SELECT @prev := -1, @rn := 0, @prev_id:=0 ,@prev_date:=-1) AS vars      

) T1 group by catid,rn

Note: The line (SELECT @prev := -1, @rn := 0, @prev_id:=0 ,@prev_date:=-1) AS vars is not necessary in Mysql Workspace, but it is in the PHP mysql_query function.

SQL FIDDLE HERE

edited Jan 07 '13 at 16:16

aZen

223
1
8

answered Sep 05 '12 at 07:18

sel

4,982
1
16
22

if we remove the record with ID = '000004' your query return (begin:2009-07-07, end:2009-07-16, rate:1) that's not correct because there is a gap, it should return (begin:2009-07-07, end:2009-07-09, rate:1) and (begin:2009-07-15, end:2009-07-16, rate:1). [SQL FIDDLE HERE](http://sqlfiddle.com/#!2/513b2/1) – Fouzi Sep 05 '12 at 07:52
@BoussahelBachir, i have edited the answer. in this case, need to include the condition of datediff to cater the situation you mention. – sel Sep 05 '12 at 08:10
You don't seem to be testing `@prev_id` anywhere... what happens if two contiguous dates with the same `Rate` have a different `CatId`? – eggyal Sep 05 '12 at 08:20
Due to i have group by catid, so it will group by catid follow by rn somehow.Thus i do not use `@prev_id` at last. Initially , i did think of to include `@prev_id` in the condition thou. – sel Sep 05 '12 at 08:30

score 0 · Answer 3 · answered May 26 '16 at 12:47

I know I am late, still posting a solution that worked for me. Had the same issue, here's how I got it

Found a good solution using variables

SELECT  MIN(id) AS id, MIN(date) AS date, MIN(state) AS state, COUNT(*) cnt
FROM    (
    SELECT  @r := @r + (@state != state OR @state IS NULL) AS gn,
            @state := state AS sn,
            s.id, s.date, s.state
    FROM    (
            SELECT  @r := 0,
                    @state := NULL
            ) vars,
            t_range s
    ORDER BY
            date, state
    ) q
GROUP BY gn

More details at : https://explainextended.com/2009/07/24/mysql-grouping-continuous-ranges/

How to group continuous ranges using MySQL

3 Answers3

Linked