1

Using MySQL. I want to get cumulative sum.

This is my table

CREATE TABLE `user_infos`
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
(..)
`created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`updated_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
PRIMARY KEY (`id`) )

And what I want to get is

+-------+-------+----------------+
| month | count | cumulative_sum |
+-------+-------+----------------+
|   01  |  100  | 100            |
|   02  |  101  | 201            |
|   ... |  110  | 311            |
|   12  |  200  | 511            |
+-------+-------+----------------+

but the result is

+-------+-------+----------------+
| month | count | cumulative_sum |
+-------+-------+----------------+
|   01  |  100  | 100            |
|   02  |  101  | 101            |
|   ... |  110  | 110            |
|   12  |  200  | 200            |
+-------+-------+----------------+

This is my wrong query..

select 
     T1.Month,T1.Count,
     @runnung_total := (@running_total + T1.Count) as cumulative_sum
from (
     select date_format(created_at,'%m') as Month,count(1) as Count from users 
          where date_format(created_at,'%Y')='2016'
          group by(date_format(created_at,'%m'))
     union
     select date_format(created_at,'%m') as Month,count(1) as Count from users 
          where date_format(created_at,'%Y')='2017'
          group by(date_format(created_at,'%m')) ) as T1
join (select @running_total := 0) as R1;

I referred to this. What's wrong in my code?

Community
  • 1
  • 1
Centell
  • 389
  • 3
  • 18

3 Answers3

2

You can achieve that in two steps: first of all get the sum for each year and month

select  concat(year(created_at), lpad(month(created_at), 2, '0')) as ye_mo,
        count(*) as cnt
from    users
group by concat(year(created_at), lpad(month(created_at), 2, '0'))

Then join it with itself, having each row matched with all previous ones

select  t1.ye_mo, sum(t2.cnt)
from    (
            select  concat(year(created_at), lpad(month(created_at), 2, '0')) as ye_mo,
                    count(*) as cnt
            from    users
            group by concat(year(created_at), lpad(month(created_at), 2, '0'))
        ) t1
join    (
            select  concat(year(created_at), lpad(month(created_at), 2, '0')) as ye_mo,
                    count(*) as cnt
            from    users
            group by concat(year(created_at), lpad(month(created_at), 2, '0'))
        ) t2
on      t1.ye_mo >= t2.ye_mo
group by t1.ye_mo
order by t1.ye_mo

Edit

The query above assumes you want the running sum to increase across different years. If you want to display the months only, and aggregate the values of different years in the same month, you can change id this way

select  t1.mnt, sum(t2.cnt)
from    (
            select  month(created_at) as mnt,
                    count(*) as cnt
            from    userss
            group by month(created_at)
        ) t1
join    (
            select  month(created_at) as mnt,
                    count(*) as cnt
            from    userss
            group by month(created_at)
        ) t2
on      t1.mnt >= t2.mnt
group by t1.mnt
order by t1.mnt

Finally, if you want the running sum to reset at the beginning of each year, you can do that like this

select  t1.yr, t1.mn, sum(t2.cnt)
from    (
            select  year(created_at) as yr, month(created_at) as mn,
                    count(*) as cnt
            from    userss
            group by year(created_at), month(created_at)
        ) t1
join    (
            select  year(created_at) as yr, month(created_at) as mn,
                    count(*) as cnt
            from    userss
            group by year(created_at), month(created_at)
        ) t2
on      t1.yr = t2.yr and
        t1.mn >= t2.mn
group by t1.yr, t1.mn
order by t1.yr, t1.mn

All three versions can be seen in action here

Stefano Zanini
  • 5,876
  • 2
  • 13
  • 33
  • I do not understand one thing that 'year_month' occur errer in my Toad for SQL. " MySQL Database Error: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'year_month, count(*) as cnt from users group by concat(year(create' at line 1 " If i change 'year_month' like 'yearMonth' then it dissapear. I don't understand why. – Centell May 18 '17 at 02:19
  • That might be because that's a reserved keyword. I updated my first option with a different alias, try it again please. – Stefano Zanini May 18 '17 at 07:23
  • wow. I need to study about reserved keyword. thank you again! – Centell May 19 '17 at 02:36
2

Variables are the right way to go. You can simplify your query:

select m.Month, m.cnt,
     (@running_total := (@running_total + m.cnt) ) as cumulative_sum
from (select month(created_at) as Month, count(*) as cnt
      from users 
      where year(created_at) in (2016, 2017)
      group by month(created_at)
     ) m cross join
     (select @running_total := 0) params
order by m.Month;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • 1
    Thank you. your code is more beautiful than me. but the result is same with my code.. – Centell May 18 '17 at 02:25
  • 1
    @Centell . . . There was a typo, `@runnung_total` rather than `running_total`. That appears to be the same problem with your query as well. – Gordon Linoff May 18 '17 at 02:27
0

Starting with MySQL 8, the ideal approach to calculate cumulative sums is by using SQL standard window functions rather than the vendor-specific, and not stricly declarative approach of using local variables. Your query can be written as follows:

WITH data(month, count) AS (
  SELECT date_format(create_at, '%m') AS month, count(*) AS count
  FROM users
  GROUP BY date_format(create_at, '%m')
)
SELECT
  month,
  count,
  sum(count) OVER (ORDER BY month) AS cumulative_sum
FROM data
Lukas Eder
  • 211,314
  • 129
  • 689
  • 1,509