2

I have a table that looks like the one below. I will like to find the sum of the cumulative difference between the target value daily actual value.

ID  | Date    | Target_value | Daily_Value
1   |01/10/20 |   200        |   5
2   |01/10/20 |   500        |   2
3   |05/10/20 |   600        |   10
1   |04/11/20 |   200        |   50
2   |05/11/20 |   500        |   80
3   |05/11/20 |   600        |   40
1   |06/12/20 |   200        |   50
4   |06/12/20 |   400        |   30
5   |07/12/20 |   300        |   20

Expected output

Date     | Target_value - monthly_cummulative daily_value          |
   10/20 | (200 + 500 + 600) - (5 + 2 + 10) =   1283               |
   11/20 | (200 + 500 + 600) - (17 + 50 + 80 + 40) = 1113          |
   12/20 | (200 + 500 + 600 + 400 + 300) - (17 + 170 + 100) = 1713 |


This is similar to this Calculating Cumulative Sum in PostgreSQL but not exactly the same.

SAB6
  • 311
  • 2
  • 6

3 Answers3

1

We can do this in two steps. First, aggregate by year and month and generate the sums for the target and daily values. Then, use SUM() as an analytic function with a rolling window across the entire intermediate table to generate the difference.

WITH cte AS (
    SELECT DATE_TRUNC('month', Date), SUM(Target_value) AS Target_value,
           SUM(Daily_Value) AS Daily_Value
    FROM yourTable
    GROUP BY 1
)

SELECT ym, Target_value,
       Target_Value - SUM(Daily_Value) OVER (ORDER BY ym) AS output
FROM cte
ORDER BY ym;

screenshot from demo link below

Demo

Tim Biegeleisen
  • 502,043
  • 27
  • 286
  • 360
0

Group by the truncated dates.

Then Sum over the daily sum.

But the targets need to be handled seperatly.

WITH CTE_TARGETS AS (
  SELECT ID
  , MAX(Target_Value) AS Target_Value
  , MIN(DATE_TRUNC('month', Date)) as month_first
  FROM your_table
  GROUP BY ID
), CTE_MONTHLY AS
(
  SELECT 
    DATE_TRUNC('month', Date) AS month_first
  , SUM(SUM(Daily_Value)) OVER (ORDER BY DATE_TRUNC('month', Date)) AS month_daily
  FROM your_table t
  GROUP BY DATE_TRUNC('month', Date)
) 
SELECT 
  TO_CHAR(mon.month_first, 'MM/YY') AS Month
, SUM(Target_Value) - month_daily AS monthly_cummulative
FROM CTE_MONTHLY mon
JOIN CTE_TARGETS tgt ON tgt.month_first <= mon.month_first
GROUP BY mon.month_first, month_daily
ORDER BY mon.month_first
month monthly_cummulative
10/20 1283
11/20 1113
12/20 1713

db<>fiddle here

LukStorms
  • 28,916
  • 5
  • 31
  • 45
0

Unlike Postgres, AWS Redshift requires a frame clause when using "over (order by ...)".

The following is the updated version of LukStorms' answer.

SELECT 
  TO_CHAR(DATE_TRUNC('month', date), 'MM/YY') AS MonthYear
, SUM(Target_value) 
  - SUM(SUM(Daily_Value)) OVER (ORDER BY MonthYear ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS monthly_cumulative
FROM yourtable
GROUP BY MonthYear
ORDER BY MonthYear;

The frame clause added here is "ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW".

For conciseness, I have also replaced all but the first use of DATE_TRUNC by the alias MonthYear.

Danny Sandi
  • 657
  • 7
  • 27