Create a Cumulative Sum Column in MySQL

Question

I have a table that looks like this:

I want to add a new column called cumulative_sum, so the table would look like this:

id   count  cumulative_sum
1    100    100
2    50     150
3    10     160

Is there a MySQL update statement that can do this easily? What's the best way to accomplish this?

OMG Ponies · Answer 1 · 2010-04-28T14:41:40.883

116

Using a correlated query:

  SELECT t.id,
         t.count,
         (SELECT SUM(x.count)
            FROM TABLE x
           WHERE x.id <= t.id) AS cumulative_sum
    FROM TABLE t
ORDER BY t.id

Using MySQL variables:

  SELECT t.id,
         t.count,
         @running_total := @running_total + t.count AS cumulative_sum
    FROM TABLE t
    JOIN (SELECT @running_total := 0) r
ORDER BY t.id

Note:

The JOIN (SELECT @running_total := 0) r is a cross join, and allows for variable declaration without requiring a separate SET command.
The table alias, r, is required by MySQL for any subquery/derived table/inline view

Caveats:

MySQL specific; not portable to other databases
The ORDER BY is important; it ensures the order matches the OP and can have larger implications for more complicated variable usage (IE: psuedo ROW_NUMBER/RANK functionality, which MySQL lacks)

edited Apr 28 '10 at 14:41

answered Apr 01 '10 at 21:54

OMG Ponies

325,700
82
523
502

I would add "ORDER BY t.id ASC" to the main query, just to make sure it'll always work – Wacek Apr 01 '10 at 22:20
My first thought also was to add ORDER BY. But it does not matter. Until addition turns into non-associative, at least :) – Dercsár Apr 02 '10 at 07:24
@OMG Poines: I think you need to use a `SELECT` in the `JOIN (SELECT @running_total := 0)` part of the variables example. – Daniel Vassallo Apr 28 '10 at 06:48
2

for "using a correlated query" where does your table x come from ? – allan.simon Sep 12 '16 at 15:40
Unless there is optimization happening internally, the correlated subquery is the equivalent of a triangular join performing in O(N^2) time--which will not scale. – Marc L. Dec 02 '16 at 16:18
@MarcL. may you please explain what you mean? – Meglio Dec 15 '16 at 03:15
For every row within the table, a full query of the same table is being performed. The number of rows in the subquery increases linearly, up to the full count of the table (maybe -1). The total scope of this forms a big "triangle". Because of the repetition involved (it doesn't add to the prior result iteratively, but instead recreates for every row) this scales along with the area of the triangle. Scale doesn't respect linear multipliers, so the (N^2)/2 is just O(N^2). – Marc L. Dec 27 '16 at 21:06
Doesn't seem to be working, @running_total seems to be 0 for every row. Could that be a version thing? (running MariaDB) – Ivan Peevski Jun 21 '17 at 22:01
if you want your comulative count from a table, then table x is the same as table t (*you count the total number of elements previous to the current, so you iterate the source table per each row in the source table*) – Evhz Jan 21 '21 at 15:08

score 98 · Accepted Answer · answered Apr 01 '10 at 22:08

98

If performance is an issue, you could use a MySQL variable:

set @csum := 0;
update YourTable
set cumulative_sum = (@csum := @csum + count)
order by id;

Alternatively, you could remove the cumulative_sum column and calculate it on each query:

set @csum := 0;
select id, count, (@csum := @csum + count) as cumulative_sum
from YourTable
order by id;

This calculates the running sum in a running way :)

answered Apr 01 '10 at 22:08

Andomar

232,371
49
380
404

7

Use a cross join to define the variable without needing to use `SET`. – OMG Ponies Apr 01 '10 at 22:12
My table has 36 million records, so this was really helpful to speed things up! – Kirk Ouimet Apr 02 '10 at 01:20
Note that ordering by cumulative_sum might force full table scan. – matt Sep 25 '12 at 09:23
1

This does work and seems quite fast; any suggestions how this can be extended to do a cumulative sum in a group? e.g. group by `Name` or similar, and then do a cumulative sum only for records with the same name – zaitsman Aug 31 '17 at 12:36
@zaitsman You can use it as a subquery; on the outer query, group by anything you want, and then use the MAX() MySQL function to get the correct cumulative summary (the last summary) that was calculated for the records inside the group. – Pascal Jul 12 '18 at 06:51
This does not seem to work with an `order` clause that differs from the order in which the records are encountered by MySQL. It seems that the cumulative sum is computed in the original order and then scrambled when the ordering is done, making it no longer "cumulative" in the desired sense. – Brick Sep 04 '18 at 14:32
1

Prefer answer of OLAP function in MySQL 8.0+, as stated in https://stackoverflow.com/a/52278657/3090068 – Yuki Inoue Sep 19 '18 at 07:03

score 40 · Answer 3 · answered Sep 11 '18 at 14:49

40

MySQL 8.0/MariaDB supports windowed SUM(col) OVER():

SELECT *, SUM(cnt) OVER(ORDER BY id) AS cumulative_sum
FROM tab;

Output:

┌─────┬──────┬────────────────┐
│ id  │ cnt  │ cumulative_sum │
├─────┼──────┼────────────────┤
│  1  │ 100  │            100 │
│  2  │  50  │            150 │
│  3  │  10  │            160 │
└─────┴──────┴────────────────┘

db<>fiddle

answered Sep 11 '18 at 14:49

Lukasz Szozda

162,964
23
234
275

2

I am looking for Cumulative sum using windows function.Thank you. – DatabaseCoder Aug 30 '19 at 06:13
@lukasz szozda, how would you insert this data into a database table column so it can be used in other tables? Thanks – kejo Sep 21 '22 at 14:36
@kejo `INSERT INTO table_name(id, cnt, cumulative_sum) SELECT ... FROM ...` or `CREATE TABLE table_name AS SELECT ... FROM ...` – Lukasz Szozda Sep 21 '22 at 14:37

score 3 · Answer 4 · edited Feb 13 '17 at 14:41

3

select Id, Count, @total := @total + Count as cumulative_sum
from YourTable, (Select @total := 0) as total ;

edited Feb 13 '17 at 14:41

raisercostin

8,777
5
67
76

answered Jul 10 '14 at 10:06

Ashutosh SIngh

931
3
13
26

4

Please explain your answer – Rohit Gupta Oct 27 '15 at 00:26
The answer works and is one liner. It also initializes/resets the variable to zero at the begining of select. – raisercostin Feb 13 '17 at 14:41

score 3 · Answer 5 · answered Apr 01 '10 at 21:59

3

UPDATE t
SET cumulative_sum = (
 SELECT SUM(x.count)
 FROM t x
 WHERE x.id <= t.id
)

answered Apr 01 '10 at 21:59

Dercsár

1,636
2
14
26

3

Although the OP did ask for an update, this is denormalized and will probably be inconvenient to maintain correctly. – Matthew Flaschen Apr 01 '10 at 22:05

Greg · Answer 6 · 2016-07-08T07:59:46.773

2

You could also create a trigger that will calculate the sum before each insert

delimiter |

CREATE TRIGGER calCumluativeSum  BEFORE INSERT ON someTable
  FOR EACH ROW BEGIN

  SET cumulative_sum = (
     SELECT SUM(x.count)
        FROM someTable x
        WHERE x.id <= NEW.id
    )

    set  NEW.cumulative_sum = cumulative_sum;
  END;
|

I have not tested this

edited Jul 08 '16 at 07:59

answered Apr 01 '10 at 22:05

Greg

1,671
2
15
30

score 2 · Answer 7 · edited Jul 04 '11 at 12:04

2

Sample query

SET @runtot:=0;
SELECT
   q1.d,
   q1.c,
   (@runtot := @runtot + q1.c) AS rt
FROM
   (SELECT
       DAYOFYEAR(date) AS d,
       COUNT(*) AS c
    FROM  orders
    WHERE  hasPaid > 0
    GROUP  BY d
    ORDER  BY d) AS q1

edited Jul 04 '11 at 12:04

Bjarki Heiðar

3,117
6
27
40

answered Jul 04 '11 at 11:03

Jazz

31
2

Pavan Bashetty · Answer 8 · 2019-02-24T04:24:49.500

1

select id,count,sum(count)over(order by count desc) as cumulative_sum from tableName;

I have used the sum aggregate function on the count column and then used the over clause. It sums up each one of the rows individually. The first row is just going to be 100. The second row is going to be 100+50. The third row is 100+50+10 and so forth. So basically every row is the sum of it and all the previous rows and the very last one is the sum of all the rows. So the way to look at this is each row is the sum of the amount where the ID is less than or equal to itself.

edited Feb 24 '19 at 04:24

answered Feb 22 '19 at 01:32

Pavan Bashetty

11
3

2

While this might solve the problem, it's better to explain it a bit so it will benefit others :) – Til Feb 22 '19 at 02:40
this isn't a co-related subquery or a subquery for that matter... co-related subquery follows `SELECT ...., (SELECT .... FROM table2 WHERE table2.id = table1.id ) FROM table1` what you have is a window query.. – Raymond Nijland Feb 23 '19 at 23:47

Flavio_cava · Answer 9 · 2020-05-20T14:12:51.297

  select t1.id, t1.count, SUM(t2.count) cumulative_sum
    from table t1 
        join table t2 on t1.id >= t2.id
    group by t1.id, t1.count

Step by step:

1- Given the following table:

select *
from table t1 
order by t1.id;

id  | count
 1  |  11
 2  |  12   
 3  |  13

2 - Get information by groups

select *
from table t1 
    join table t2 on t1.id >= t2.id
order by t1.id, t2.id;

id  | count | id | count
 1  | 11    | 1  |  11

 2  | 12    | 1  |  11
 2  | 12    | 2  |  12

 3  | 13    | 1  |  11
 3  | 13    | 2  |  12
 3  | 13    | 3  |  13

3- Step 3: Sum all count by t1.id group

select t1.id, t1.count, SUM(t2.count) cumulative_sum
from table t1 
    join table t2 on t1.id >= t2.id
group by t1.id, t1.count;


id  | count | cumulative_sum
 1  |  11   |    11
 2  |  12   |    23
 3  |  13   |    36

Added some step by step to understand the final query – Flavio_cava May 20 '20 at 14:13 — Flavio_cava, May 20 '20 at 14:13

Create a Cumulative Sum Column in MySQL

9 Answers9

Using a correlated query:

Using MySQL variables:

Linked

Related