0

I am quite new with postgreSQL. I am trying to get something that initially should not be difficult, but I am struggling. I appreciate any suggestion.

The main idea is to calculate the actual quantity based on current difference (po-usage) for the current month plus the accumulative from the previous months. Each month is having the quantity of purchasing order (po) and usage (usage) for two categories. The issue come from a table que contain several categories and data, but I summarized for two categories due to that I am having the same issue. This is the example table:

 create table foo as select * from (values

  ('10/21/2014', 'Catb', 4, 3),
  ('10/22/2014', 'Catb', 6, 4),
  ('10/23/2014', 'Catb', 7, 1),
  ('10/24/2014', 'Catb', 8, 5),
  ('10/25/2014', 'Catb', 4, 6),
  ('10/26/2014', 'Catb', 2, 7),
  ('10/27/2014', 'Catb', 3, 6),
  ('10/27/2014', 'Catb', 8 , 8),
  ('10/28/2014', 'Cate', 7 , 3),
  ('10/29/2014', 'Cate', 3, 5),
  ('10/30/2014', 'Cate', 20 , 10)         
  )
as z(Date, cat, po,usage )

The intended solution table should be like this:

+------------+------+----+-------+--------------------+--------------+
|    date    | cat  | po | usage | diffmonth=po-usage | accumulative |
+------------+------+----+-------+--------------------+--------------+
| 21/10/2014 | catb |  4 |     3 |                  1 |              |
| 22/10/2014 | catb |  6 |     4 |                  2 |            2 |
| 23/10/2014 | catb |  7 |     1 |                  6 |            8 |
| 24/10/2014 | catb |  8 |     5 |                  3 |           11 |
| 25/10/2014 | catb |  4 |     6 |                 -2 |            9 |
| 26/10/2014 | catb |  2 |     7 |                 -5 |            4 |
| 27/10/2014 | catb |  3 |     6 |                 -3 |            1 |
| 27/10/2014 | catb |  8 |     8 |                  0 |            1 |
| 28/10/2014 | cate |  7 |     3 |                  4 |            4 |
| 29/10/2014 | cate |  3 |     5 |                 -2 |            2 |
| 30/10/2014 | cate | 20 |    10 |                 10 |            8 |
+------------+------+----+-------+--------------------+--------------+

I have built the following query, but it seems that is not good enough. I think, if I am not wrong with the approach, it needs to be calculate based on current accumulative plus previous accumulative month.


    with X3 as
(
        SELECT *
       ,po-usage as diffmonth
       ,po-usage as diffAcc

FROM foo
order by date asc
)

select
        foo.date
        ,foo.cat
        ,foo.po
        ,foo.usage
        ,x3.diffmonth
        ,lag(x3.diffAcc) over (partition by x3.cat order by x3.date) as LagDiffmonth
        ,x3.diffmonth + lag(x3.diffAcc) over (partition by x3.cat order by x3.date) as Accumulative
from foo
inner join X3 on foo.cat = x3.cat and foo.date = x3.date

Where I have got as follow:

+------------+------+----+-------+-----------+--------------+--------------+
|    date    | cat  | po | usage | diffmonth | lagdiffmonth | accumulative |
+------------+------+----+-------+-----------+--------------+--------------+
| 10/21/2014 | Catb |  4 |     3 |         1 | (null)       | (null)       |
| 10/22/2014 | Catb |  6 |     4 |         2 | 1            | 3            |
| 10/23/2014 | Catb |  7 |     1 |         6 | 2            | 8            |
| 10/24/2014 | Catb |  8 |     5 |         3 | 6            | 9            |
| 10/25/2014 | Catb |  4 |     6 |        -2 | 3            | 1            |
| 10/26/2014 | Catb |  2 |     7 |        -5 | -2           | -7           |
| 10/27/2014 | Catb |  3 |     6 |        -3 | -5           | -8           |
| 10/27/2014 | Catb |  3 |     6 |         0 | -3           | -3           |
| 10/27/2014 | Catb |  8 |     8 |        -3 | 0            | -3           |
| 10/27/2014 | Catb |  8 |     8 |         0 | -3           | -3           |
| 10/28/2014 | Cate |  7 |     3 |         4 | (null)       | (null)       |
| 10/29/2014 | Cate |  3 |     5 |        -2 | 4            | 2            |
| 10/30/2014 | Cate | 20 |    10 |        10 | -2           | 8            |
+------------+------+----+-------+-----------+--------------+--------------+
Belayer
  • 13,578
  • 2
  • 11
  • 22
edw
  • 1
  • Welcome to the SO community. Glad to see text instead of images. Now time to take the next step - format the text. One method is using [Senseful Solutions](https://senseful.github.io/text-table/). Create a tab separated list, past that into the input box click `create table`. Finally copy and paste the output box in your question between lines containing just ```. – Belayer Feb 14 '21 at 20:24
  • @edw: why doesn't `accumulative` start at `1` for `catb`? It does start at `4` for `cate`. Shouldn't the last `accumulative` value for `cate` be `12` too? – Marth Feb 14 '21 at 20:54
  • What you're looking for is "cumulative sum", see [here](https://stackoverflow.com/questions/22841206/calculating-cumulative-sum-in-postgresql) – bobflux Feb 14 '21 at 21:51

1 Answers1

0

You are actually pretty close. Just use the windowed variant of the sum function rather that lag.

with X3 (date, cat, po, usage, diffmonth ) as 
     ( select date, cat, po,usage, po-usage  
         from foo
     ) 
select x3.*
     , coalesce(sum(diffmonth) over (partition by cat order by date), 0) accum_diff 
  from x3;

This can be reduced to a single simple select.

select *, po-usage diffmonth 
     , coalesce(sum(po-usage) over (partition by cat order by date), 0) accum_diff 
  from foo;

BTW: I would advise against using Date as a column name. If is a reserved word and a data type. You can get away with it now but being a reserved word Postgres can enforce a predefined meaning at any time. This would lead to extremely difficult to find errors and/or bugs. Apply this: Do not use reserved words and data types as object names.

Belayer
  • 13,578
  • 2
  • 11
  • 22