85

I have a table that looks like this:

id   count
1    100
2    50
3    10

I want to add a new column called cumulative_sum, so the table would look like this:

id   count  cumulative_sum
1    100    100
2    50     150
3    10     160

Is there a MySQL update statement that can do this easily? What's the best way to accomplish this?

Bulat
  • 6,869
  • 1
  • 29
  • 52
Kirk Ouimet
  • 27,280
  • 43
  • 127
  • 177

9 Answers9

116

Using a correlated query:


  SELECT t.id,
         t.count,
         (SELECT SUM(x.count)
            FROM TABLE x
           WHERE x.id <= t.id) AS cumulative_sum
    FROM TABLE t
ORDER BY t.id

Using MySQL variables:


  SELECT t.id,
         t.count,
         @running_total := @running_total + t.count AS cumulative_sum
    FROM TABLE t
    JOIN (SELECT @running_total := 0) r
ORDER BY t.id

Note:

  • The JOIN (SELECT @running_total := 0) r is a cross join, and allows for variable declaration without requiring a separate SET command.
  • The table alias, r, is required by MySQL for any subquery/derived table/inline view

Caveats:

  • MySQL specific; not portable to other databases
  • The ORDER BY is important; it ensures the order matches the OP and can have larger implications for more complicated variable usage (IE: psuedo ROW_NUMBER/RANK functionality, which MySQL lacks)
OMG Ponies
  • 325,700
  • 82
  • 523
  • 502
  • I would add "ORDER BY t.id ASC" to the main query, just to make sure it'll always work – Wacek Apr 01 '10 at 22:20
  • My first thought also was to add ORDER BY. But it does not matter. Until addition turns into non-associative, at least :) – Dercsár Apr 02 '10 at 07:24
  • @OMG Poines: I think you need to use a `SELECT` in the `JOIN (SELECT @running_total := 0)` part of the variables example. – Daniel Vassallo Apr 28 '10 at 06:48
  • 2
    for "using a correlated query" where does your table x come from ? – allan.simon Sep 12 '16 at 15:40
  • Unless there is optimization happening internally, the correlated subquery is the equivalent of a triangular join performing in O(N^2) time--which will not scale. – Marc L. Dec 02 '16 at 16:18
  • @MarcL. may you please explain what you mean? – Meglio Dec 15 '16 at 03:15
  • For every row within the table, a full query of the same table is being performed. The number of rows in the subquery increases linearly, up to the full count of the table (maybe -1). The total scope of this forms a big "triangle". Because of the repetition involved (it doesn't add to the prior result iteratively, but instead recreates for every row) this scales along with the area of the triangle. Scale doesn't respect linear multipliers, so the (N^2)/2 is just O(N^2). – Marc L. Dec 27 '16 at 21:06
  • Doesn't seem to be working, @running_total seems to be 0 for every row. Could that be a version thing? (running MariaDB) – Ivan Peevski Jun 21 '17 at 22:01
  • if you want your comulative count from a table, then table x is the same as table t (*you count the total number of elements previous to the current, so you iterate the source table per each row in the source table*) – Evhz Jan 21 '21 at 15:08
98

If performance is an issue, you could use a MySQL variable:

set @csum := 0;
update YourTable
set cumulative_sum = (@csum := @csum + count)
order by id;

Alternatively, you could remove the cumulative_sum column and calculate it on each query:

set @csum := 0;
select id, count, (@csum := @csum + count) as cumulative_sum
from YourTable
order by id;

This calculates the running sum in a running way :)

Andomar
  • 232,371
  • 49
  • 380
  • 404
  • 7
    Use a cross join to define the variable without needing to use `SET`. – OMG Ponies Apr 01 '10 at 22:12
  • My table has 36 million records, so this was really helpful to speed things up! – Kirk Ouimet Apr 02 '10 at 01:20
  • Note that ordering by cumulative_sum might force full table scan. – matt Sep 25 '12 at 09:23
  • 1
    This does work and seems quite fast; any suggestions how this can be extended to do a cumulative sum in a group? e.g. group by `Name` or similar, and then do a cumulative sum only for records with the same name – zaitsman Aug 31 '17 at 12:36
  • @zaitsman You can use it as a subquery; on the outer query, group by anything you want, and then use the MAX() MySQL function to get the correct cumulative summary (the last summary) that was calculated for the records inside the group. – Pascal Jul 12 '18 at 06:51
  • This does not seem to work with an `order` clause that differs from the order in which the records are encountered by MySQL. It seems that the cumulative sum is computed in the original order and then scrambled when the ordering is done, making it no longer "cumulative" in the desired sense. – Brick Sep 04 '18 at 14:32
  • 1
    Prefer answer of OLAP function in MySQL 8.0+, as stated in https://stackoverflow.com/a/52278657/3090068 – Yuki Inoue Sep 19 '18 at 07:03
40

MySQL 8.0/MariaDB supports windowed SUM(col) OVER():

SELECT *, SUM(cnt) OVER(ORDER BY id) AS cumulative_sum
FROM tab;

Output:

┌─────┬──────┬────────────────┐
│ id  │ cnt  │ cumulative_sum │
├─────┼──────┼────────────────┤
│  1  │ 100  │            100 │
│  2  │  50  │            150 │
│  3  │  10  │            160 │
└─────┴──────┴────────────────┘

db<>fiddle

Lukasz Szozda
  • 162,964
  • 23
  • 234
  • 275
3
select Id, Count, @total := @total + Count as cumulative_sum
from YourTable, (Select @total := 0) as total ;
raisercostin
  • 8,777
  • 5
  • 67
  • 76
Ashutosh SIngh
  • 931
  • 3
  • 13
  • 26
3
UPDATE t
SET cumulative_sum = (
 SELECT SUM(x.count)
 FROM t x
 WHERE x.id <= t.id
)
Dercsár
  • 1,636
  • 2
  • 14
  • 26
2

You could also create a trigger that will calculate the sum before each insert

delimiter |

CREATE TRIGGER calCumluativeSum  BEFORE INSERT ON someTable
  FOR EACH ROW BEGIN

  SET cumulative_sum = (
     SELECT SUM(x.count)
        FROM someTable x
        WHERE x.id <= NEW.id
    )

    set  NEW.cumulative_sum = cumulative_sum;
  END;
|

I have not tested this

Greg
  • 1,671
  • 2
  • 15
  • 30
2

Sample query

SET @runtot:=0;
SELECT
   q1.d,
   q1.c,
   (@runtot := @runtot + q1.c) AS rt
FROM
   (SELECT
       DAYOFYEAR(date) AS d,
       COUNT(*) AS c
    FROM  orders
    WHERE  hasPaid > 0
    GROUP  BY d
    ORDER  BY d) AS q1
Bjarki Heiðar
  • 3,117
  • 6
  • 27
  • 40
Jazz
  • 31
  • 2
1

select id,count,sum(count)over(order by count desc) as cumulative_sum from tableName;

I have used the sum aggregate function on the count column and then used the over clause. It sums up each one of the rows individually. The first row is just going to be 100. The second row is going to be 100+50. The third row is 100+50+10 and so forth. So basically every row is the sum of it and all the previous rows and the very last one is the sum of all the rows. So the way to look at this is each row is the sum of the amount where the ID is less than or equal to itself.

  • 2
    While this might solve the problem, it's better to explain it a bit so it will benefit others :) – Til Feb 22 '19 at 02:40
  • this isn't a co-related subquery or a subquery for that matter... co-related subquery follows `SELECT ...., (SELECT .... FROM table2 WHERE table2.id = table1.id ) FROM table1` what you have is a window query.. – Raymond Nijland Feb 23 '19 at 23:47
0
  select t1.id, t1.count, SUM(t2.count) cumulative_sum
    from table t1 
        join table t2 on t1.id >= t2.id
    group by t1.id, t1.count

Step by step:

1- Given the following table:

select *
from table t1 
order by t1.id;

id  | count
 1  |  11
 2  |  12   
 3  |  13

2 - Get information by groups

select *
from table t1 
    join table t2 on t1.id >= t2.id
order by t1.id, t2.id;

id  | count | id | count
 1  | 11    | 1  |  11

 2  | 12    | 1  |  11
 2  | 12    | 2  |  12

 3  | 13    | 1  |  11
 3  | 13    | 2  |  12
 3  | 13    | 3  |  13

3- Step 3: Sum all count by t1.id group

select t1.id, t1.count, SUM(t2.count) cumulative_sum
from table t1 
    join table t2 on t1.id >= t2.id
group by t1.id, t1.count;


id  | count | cumulative_sum
 1  |  11   |    11
 2  |  12   |    23
 3  |  13   |    36