
I have a table with id and values as shown below. Is it possible to get another column that takes the value divided by the cumulative average as we go down the rows?

Original table: t1

    +----+------+
    | id | Val  |
    +----+------+
    |  1 | NULL |
    |  2 |  136 |
    |  3 |   42 |
    +----+------+


The table I want to get:

    +----+------+----------------------------+
    | id | Val  | Val/AVG(Val)               |
    +----+------+----------------------------+
    |  1 | NULL |                       NULL |
    |  2 |  136 |   136/((136+0)/2) = 2.000  |
    |  3 |   42 |  42/((42+136+0)/3) = 0.708 |
    +----+------+----------------------------+

Here is my query:

SELECT  t1.id, t1.Val, t1.Val/AVG(t1.Val)
FROM    followers t1
JOIN    followers t2
ON      t2.id <= t1.id
GROUP BY t1.id;

However, I get this instead:

+----+------+--------------+
| id | Val  | Val/AVG(Val) |
+----+------+--------------+
|  1 | NULL |         NULL |
|  2 |  136 |       1.0000 |
|  3 |   42 |       1.0000 |
+----+------+--------------+

It seems like AVG(Val) just returns each row's own value from the Val column.

I was hoping to do something similar to this question, but with an average instead of a sum: MySQL SELECT function to sum current data
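To confirm what's going wrong, here is a sketch using Python's sqlite3 module (SQLite accepts the same bare-column GROUP BY as MySQL): because the query groups by t1.id, every joined row in a group carries the same t1.Val, so AVG(t1.Val) collapses to t1.Val itself and the ratio is always 1 (or NULL).

```python
import sqlite3

# Rebuild the example table in memory and run the original query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE followers (id INTEGER, Val INTEGER)")
conn.executemany("INSERT INTO followers VALUES (?, ?)",
                 [(1, None), (2, 136), (3, 42)])

rows = conn.execute("""
    SELECT t1.id, t1.Val, t1.Val * 1.0 / AVG(t1.Val)
    FROM followers t1
    JOIN followers t2 ON t2.id <= t1.id
    GROUP BY t1.id
    ORDER BY t1.id
""").fetchall()

# Within each group, AVG(t1.Val) averages copies of the same value,
# so the third column is NULL, 1.0, 1.0 - exactly the broken output.
print(rows)
```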


I re-implemented the suggested edits and took the NULL row into account, but I still get the wrong ratios:

+----+------+--------------+
| id | Val  | Val/AVG(Val) |
+----+------+--------------+
|  1 | NULL |         NULL |
|  2 |  136 |       1.0000 | <-- need this to be 2.000
|  3 |   42 |       0.4719 | <-- need this to be 0.708
+----+------+--------------+

SELECT  t1.id, t1.Val, t1.Val/(SUM(t2.Val)/t1.id) AS C
FROM    followers t1
JOIN    followers t2
ON      t2.id <= t1.id
GROUP BY t1.id;
– jxn
  • You're after a running total. Then it's simple math. This is accomplished using variables. Here's an example: http://stackoverflow.com/questions/664700/calculate-a-running-total-in-mysql or http://stackoverflow.com/questions/8359077/mysql-running-total-with-count – xQbert May 20 '14 at 22:30
  • Can you try t1.val/AVG(t2.val)? – tmlai May 20 '14 at 22:33
  • Why not just calculate in application layer? Doing this in DB layer is really asking the DB to perform logic it is not optimized for anyway. – Mike Brant May 20 '14 at 22:35
  • @MikeBrant Are you sure? Seems OK to me !?! – Strawberry May 20 '14 at 22:38
  • @Strawberry Well joining a table to itself in what amounts to a partial Cartesian join as proposed in answer from Gordon Linoff below. Or doing some other procedure based approach with MySQL variables may not be as efficient as just querying the individual rows and calculating the running totals in your application. – Mike Brant May 20 '14 at 22:41

1 Answer


I think you want t2.val in the avg():

SELECT  t1.id, t1.Val, t1.Val/AVG(t2.Val)
FROM    followers t1 JOIN
        followers t2
        ON t2.id <= t1.id
GROUP BY t1.id;
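As a quick sanity check, here is a sketch of that query against an in-memory SQLite database via Python's sqlite3 module (SQLite accepts the same bare-column GROUP BY). Note that AVG() skips NULLs, so the id = 2 row comes out as 136/136 = 1.0 and id = 3 as 42/((136+42)/2) ≈ 0.4719, matching the second table in the question rather than its NULL-as-zero expected figures:

```python
import sqlite3

# AVG(t2.Val) now aggregates over all rows with t2.id <= t1.id,
# so the average actually varies from group to group.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE followers (id INTEGER, Val INTEGER)")
conn.executemany("INSERT INTO followers VALUES (?, ?)",
                 [(1, None), (2, 136), (3, 42)])

for id_, val, ratio in conn.execute("""
    SELECT t1.id, t1.Val, t1.Val * 1.0 / AVG(t2.Val)
    FROM followers t1
    JOIN followers t2 ON t2.id <= t1.id
    GROUP BY t1.id
    ORDER BY t1.id
"""):
    print(id_, val, ratio)  # (1, None, None), (2, 136, 1.0), (3, 42, ~0.4719)
```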

EDIT:

Mike Brant is correct that the above is a lousy way to do what you want. In MySQL, you can do the same using variables:

select t.id, t.val,
       (case when (@n := @n + 1) is null then null
             when (@cumval := @cumval + val) is null then null
             else t.val / (@cumval / @n)
        end)
from followers t cross join
     (select @n := 0, @cumval := 0) vars
order by t.id;

This might misbehave with NULL values of val, but it gives the idea for a faster way to do the calculation in MySQL.
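For comparison, here is the same running calculation sketched in plain Python, with a NULL row counted as 0 in the running sum but still included in the row count (the convention the question's expected table uses). This reproduces the 2.000 and 0.708 figures the asker wants:

```python
# Running average mirroring the @n / @cumval variables above,
# with NULL (None) treated as 0 in the numerator of the average.
vals = [None, 136, 42]  # the Val column in id order
n = 0
cumval = 0.0
ratios = []
for v in vals:
    n += 1                    # rows seen so far, NULL rows included
    cumval += v or 0          # a NULL row contributes 0 to the sum
    ratios.append(None if v is None else v / (cumval / n))

print(ratios)  # [None, 2.0, 0.7078651685393258]
```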

– Gordon Linoff
  • While this should work, it would be horrendously slow on DB tables of significant size. You are in essence going to create n * ((n-1)/2) rows. Fine if your DB is 10 rows, which results in 45 rows after join. But if your DB is 1M rows the resulting join would be almost 500 billion rows. – Mike Brant May 20 '14 at 22:47
  • @GordonLinoff The original query worked; the one with variables returned NULL for the row I wanted. – jxn May 20 '14 at 23:16