0

Here's my sample data (table "sumtest"):

+-------+--------+-------+
| month |  value |  year |
+-------+--------+-------+
|     1 |     10 |  2017 |
|     2 |      0 |  2017 |
|     2 |     10 |  2016 |  # note: different year
|     2 |      5 |  2017 |
|     3 |     88 |  2017 |
|     3 |      2 |  2017 |
|     5 |      1 |  2017 |
|     5 |      4 |  2017 |
|     5 |      5 |  2017 |
+-------+--------+-------+

I'd like to get the total value for each month, and the running total for each year in that particular month, i.e. I'd like my result to be this:

+------+-------+-----------+----------+
| year | month | sum_month | sum_year |
+------+-------+-----------+----------+
| 2016 |     2 |        10 |       10 |
| 2017 |     1 |        10 |       10 |
| 2017 |     2 |         5 |       15 |
| 2017 |     3 |        90 |      105 |
| 2017 |     5 |        10 |      115 |
+------+-------+-----------+----------+

I'm new to Postgres and I've tried the following:

    SELECT *, sum(value) OVER (PARTITION BY month, year) AS sum_month, 
sum(value) OVER (PARTITION BY year) AS sum_year
    FROM sumtest
    ORDER BY year, month

But this yields one line for each original entry, and the total yearly sum listed on each line instead of the cumulative sum up to this point:

+-------+-------+------+-----------+----------+
| month | value | year | sum_month | sum_year |
+-------+-------+------+-----------+----------+
|     2 |    10 | 2016 | '10'      | '10'     |
|     1 |    10 | 2017 | '10'      | '115'    |
|     2 |     5 | 2017 | '5'       | '115'    |
|     2 |     0 | 2017 | '5'       | '115'    |
|     3 |     2 | 2017 | '90'      | '115'    |
|     3 |    88 | 2017 | '90'      | '115'    |
|     5 |     4 | 2017 | '10'      | '115'    |
|     5 |     1 | 2017 | '10'      | '115'    |
|     5 |     5 | 2017 | '10'      | '115'    |
+-------+-------+------+-----------+----------+

I've also tried using GROUP BY, which worked for the cumulative sums of the months, but then I did not now how to include the running total of the year (as that is not supposed to be grouped by month).

Any help would be greatly appreciated.

canavanin
  • 2,559
  • 3
  • 25
  • 36
  • Possible Duplicate : https://stackoverflow.com/questions/22841206/calculating-cumulative-sum-in-postgresql – Andrew Jan 08 '18 at 11:46
  • @Andrew That question is where I got the PARTITION BY-statement from, but I also need the grouping. – canavanin Jan 08 '18 at 11:53

2 Answers2

4

You could add MAX and GROUP BY above your query, along with ORDER BY inside OVER()

select year,month,MAX( sum_month) sum_month,  MAX(sum_year) sum_year 
FROM
(
 SELECT *, sum(value) OVER (PARTITION BY month, year ORDER BY year,month) AS sum_month, 
sum(value) OVER (PARTITION BY year ORDER BY year,month) AS sum_year
    FROM sumtest
)  a
GROUP BY year,month;

DEMO

Kaushik Nayak
  • 30,772
  • 5
  • 32
  • 45
  • Works very nicely, thanks! One more question though: why is ORDER BY required inside OVER()? Thanks. – canavanin Jan 08 '18 at 13:26
  • 1
    @canavanin: when you said running sum, you need a direction to run, don't you? Hence Order by specifies the order to evaluate the aggregation (SUM). If you don't specify the window clauses ROWS between, the default value with order by is running sum – Kaushik Nayak Jan 08 '18 at 13:50
1

prepare:

t=# create table s(a text,m int, v int, y int, b text);
CREATE TABLE
t=# copy s from stdin delimiter '|';
>> \.
COPY 9
t=# alter table s drop column a;
ALTER TABLE
t=# alter table s drop column b;
ALTER TABLE

query:

t=# select distinct y,m,sum(v) over (partition by m,y), sum(v) over (partition by y order by m) from s order by y,m;
  y   | m | sum | sum
------+---+-----+-----
 2016 | 2 |  10 |  10
 2017 | 1 |  10 |  10
 2017 | 2 |   5 |  15
 2017 | 3 |  90 | 105
 2017 | 5 |  10 | 115
(5 rows)

updated - I totally missed cumulative

Vao Tsun
  • 47,234
  • 13
  • 100
  • 132