Sum of cumulative difference within group in Postgres

Question

I have a table that looks like the one below. For each month, I would like to find the difference between the sum of the target values and the cumulative sum of the daily actual values.

ID  | Date    | Target_value | Daily_Value
1   |01/10/20 |   200        |   5
2   |01/10/20 |   500        |   2
3   |05/10/20 |   600        |   10
1   |04/11/20 |   200        |   50
2   |05/11/20 |   500        |   80
3   |05/11/20 |   600        |   40
1   |06/12/20 |   200        |   50
4   |06/12/20 |   400        |   30
5   |07/12/20 |   300        |   20
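
To reproduce the problem, the sample rows above could be loaded with a script along these lines. This is only a sketch: the table name your_table, the column types, and the reading of the dates as DD/MM/YY (which the months in the expected output suggest) are assumptions, not something stated in the question.

```sql
-- Hypothetical setup: table name and column types are assumptions.
CREATE TABLE your_table (
    id           integer,
    date         date,
    target_value integer,
    daily_value  integer
);

-- Dates from the question, read as DD/MM/YY and written as ISO literals.
INSERT INTO your_table (id, date, target_value, daily_value) VALUES
    (1, DATE '2020-10-01', 200,  5),
    (2, DATE '2020-10-01', 500,  2),
    (3, DATE '2020-10-05', 600, 10),
    (1, DATE '2020-11-04', 200, 50),
    (2, DATE '2020-11-05', 500, 80),
    (3, DATE '2020-11-05', 600, 40),
    (1, DATE '2020-12-06', 200, 50),
    (4, DATE '2020-12-06', 400, 30),
    (5, DATE '2020-12-07', 300, 20);
```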

Expected output

Date     | Target_value - monthly_cumulative daily_value           |
   10/20 | (200 + 500 + 600) - (5 + 2 + 10) =   1283               |
   11/20 | (200 + 500 + 600) - (17 + 50 + 80 + 40) = 1113          |
   12/20 | (200 + 500 + 600 + 400 + 300) - (17 + 170 + 100) = 1713 |

This is similar to the question "Calculating Cumulative Sum in PostgreSQL", but not exactly the same.

Tim Biegeleisen · Answer 1 · 2021-12-13T15:13:44.590

We can do this in two steps. First, aggregate by year and month and generate the sums for the target and daily values. Then, use SUM() as an analytic function with a rolling window across the entire intermediate table to generate the difference.

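WITH cte AS (
    -- Step 1: one row per month with the summed target and daily values
    SELECT DATE_TRUNC('month', Date) AS ym,
           SUM(Target_value) AS Target_value,
           SUM(Daily_Value) AS Daily_Value
    FROM yourTable
    GROUP BY 1
)

-- Step 2: subtract the running total of daily values from each month's target
SELECT ym, Target_value,
       Target_Value - SUM(Daily_Value) OVER (ORDER BY ym) AS output
FROM cte
ORDER BY ym;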

Demo

edited Dec 13 '21 at 15:13

answered Dec 13 '21 at 15:05

Tim Biegeleisen

Cool. Hadn't used date_trunc before. – xQbert Dec 13 '21 at 15:13
`DATE_TRUNC('month', Date),` in the CTE needs to be `DATE_TRUNC('month', Date) ym,` like how you handled the cumulative sum. – xQbert Dec 13 '21 at 15:23

LukStorms · Answer 2 · 2021-12-13T19:04:48.077

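Group by the truncated dates.

Then take a running sum over the monthly sums of the daily values.

But the targets need to be handled separately: each ID's target should count only once, starting from the first month in which that ID appears.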

WITH CTE_TARGETS AS (
  -- One row per ID: its target and the first month in which it appears
  SELECT ID
  , MAX(Target_Value) AS Target_Value
  , MIN(DATE_TRUNC('month', Date)) as month_first
  FROM your_table
  GROUP BY ID
), CTE_MONTHLY AS
(
  -- One row per month: running total of the monthly daily-value sums
  SELECT
    DATE_TRUNC('month', Date) AS month_first
  , SUM(SUM(Daily_Value)) OVER (ORDER BY DATE_TRUNC('month', Date)) AS month_daily
  FROM your_table t
  GROUP BY DATE_TRUNC('month', Date)
)
-- Each month counts the targets of every ID first seen in or before that month
SELECT
  TO_CHAR(mon.month_first, 'MM/YY') AS Month
, SUM(Target_Value) - month_daily AS monthly_cummulative
FROM CTE_MONTHLY mon
JOIN CTE_TARGETS tgt ON tgt.month_first <= mon.month_first
GROUP BY mon.month_first, month_daily
ORDER BY mon.month_first;

month	monthly_cummulative
10/20	1283
11/20	1113
12/20	1713

db<>fiddle here
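
To see why the targets are joined on the first month of each ID, it may help to look at the cumulative target on its own. The query below is a minimal sketch of that intermediate step. It assumes the same your_table with a date-typed Date column, and the CTE names first_seen and months are made up for illustration.

```sql
-- Cumulative target per month: each ID is counted once,
-- from the first month in which it appears onward.
WITH first_seen AS (
  SELECT id,
         MAX(target_value) AS target_value,
         MIN(DATE_TRUNC('month', date)) AS month_first
  FROM your_table
  GROUP BY id
), months AS (
  -- Every month that occurs in the data, including months with no new IDs
  SELECT DISTINCT DATE_TRUNC('month', date) AS month_first
  FROM your_table
)
SELECT TO_CHAR(m.month_first, 'MM/YY') AS month,
       SUM(f.target_value) AS cumulative_target
FROM months m
JOIN first_seen f ON f.month_first <= m.month_first
GROUP BY m.month_first
ORDER BY m.month_first;
```

With the sample rows from the question, this should give 1300 for 10/20 and 11/20 and 2000 for 12/20, which are the bracketed target sums in the expected output.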

@TimBiegeleisen Oops, you are correct sir. Thx for pointing it out. Fixed now. — LukStorms, Dec 13 '21 at 15:38

score 0 · Answer 3 · edited Dec 14 '21 at 10:26

Unlike Postgres, AWS Redshift requires an explicit frame clause when an aggregate window function uses "over (order by ...)".

The following is the updated version of LukStorms' answer.

SELECT 
  TO_CHAR(DATE_TRUNC('month', date), 'MM/YY') AS MonthYear
, SUM(Target_value) 
  - SUM(SUM(Daily_Value)) OVER (ORDER BY MonthYear ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS monthly_cumulative
FROM yourtable
GROUP BY MonthYear
ORDER BY MonthYear;

The frame clause added here is "ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW".

For conciseness, I have also replaced all but the first use of DATE_TRUNC by the alias MonthYear.
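
For comparison, the same frame can also be spelled out in Postgres, where it is optional because an OVER clause with ORDER BY defaults to RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. The sketch below shows only the running total of the daily values and assumes a table named your_table with a date-typed date column.

```sql
-- Running total of monthly daily-value sums with an explicit frame.
-- Months are unique after grouping, so ROWS and the default RANGE frame
-- produce the same running total here.
SELECT DATE_TRUNC('month', date) AS month_first,
       SUM(SUM(daily_value)) OVER (
         ORDER BY DATE_TRUNC('month', date)
         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS month_daily
FROM your_table
GROUP BY DATE_TRUNC('month', date)
ORDER BY month_first;
```

On the sample rows from the question, the running totals should be 17, 187 and 287, matching the cumulative daily values in the expected output.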
