1

I'm trying to calculate row differences (like MySQL difference between two rows of a SELECT Statement) over a grouped result set:

create table test (i int not null auto_increment, a int, b int, primary key (i));
insert into test (a,b) value (1,1),(1,2),(2,4),(2,8);

Gives

| a | b
---------
| 1 | 1
| 1 | 2
| 2 | 4
| 2 | 8

This is the simple SQL with group and max(group) result columns:

select 
    data.a,
    max(data.b)
from
    (
        select a, b
        from test
        order by i
    ) as data
group by a
order by a

The obvious result is

| a | max(data.b)
-----------------
| 1 | 2
| 2 | 8

Where I'm failing is when I want to calculate the row-by-row differences on the grouped column:

set @c:=0;
select 
    data.a,
    max(data.b),
    @c:=max(data.b)-@c
from
    (
        select a, b
        from test
        order by i
    ) as data
group by a
order by a

Still gives:

| a | max(data.b) | @c:=max(data.b)-@c
--------------------------------------
| 1 | 2           | 2 (expected 2-0=2)
| 2 | 8           | 8 (expected 8-2=6)

Could anybody highlight why the @c variable is not updating from grouped row to grouped row as expected?

Community
  • 1
  • 1
andig
  • 13,378
  • 13
  • 61
  • 98
  • Your first query is not 'obvious'. In fact, it's wrong, or, more precisely, the result cannot (according to MySQL's documentation) be guaranteed! – Strawberry Jun 08 '14 at 17:02
  • @Strawberry: could you point to the relevant docs? Can't find why this wouldn't work. I've read 9.4 User-Defined Variables but that doesn't seem to apply here (accessing variables in HAVING, GROUP BY, or ORDER BY clause). – andig Jun 08 '14 at 17:14
  • 1
    See https://dev.mysql.com/doc/refman/5.0/en/group-by-extensions.html - although I will concede that this hack does seems to work in every version of MySQL that has supported the construct! – Strawberry Jun 08 '14 at 17:38

2 Answers2

1
SELECT data.a
     , data.b
     , @c := data.b - @c
FROM (
        SELECT a
             , max(b) AS b
        FROM test
        GROUP BY a
     ) AS data
ORDER BY a

Example

potashin
  • 44,205
  • 11
  • 83
  • 107
  • @andig : there is a problem with assigning variables to aggregate function result (this assigning is just ignored). [Related question](http://stackoverflow.com/q/22384672/3444240) – potashin Jun 08 '14 at 17:12
0

The 'documented' solution might look like this...

SELECT x.* 
     , @c := b - @c c
  FROM test x 
  JOIN 
     ( SELECT a,MAX(b) max_b FROM test GROUP BY a ) y 
    ON y.a = x.a 
   AND y.max_b = x.b
  JOIN (SELECT @c:= 0) vals;
Strawberry
  • 33,750
  • 13
  • 40
  • 57
  • I'm still failing to exactly understand what the actual problem is, although your solution is fine. Even setting SET GLOBAL sql_mode = 'ONLY_FULL_GROUP_BY' my statements still execute which they shouldn't according to the comment on the question? – andig Jun 08 '14 at 20:56