0

I'm using postgreSQL 9.2 I have a table with transactions that affect an account startbalance and I want to create a query that generates the monthly balances. It seems simple, but I'm stuck and the closest thing I can find might be this Calculate Monthly Recurring Revenue(MRR) result using postgres because it uses generate_series, which I think I need to create a list of the months to match up my accounts and transactions.

Here is the DDL for my tables:

CREATE TABLE account (
  id integer NOT NULL,
  startbalance numeric(19,2),
  opendate date NOT NULL
);

CREATE TABLE trx (
 id integer NOT NULL,
 accountid integer NOT NULL,
 amount numeric(19,2) NOT NULL,
 transactiondate date NOT NULL
);
ALTER TABLE ONLY account
 ADD CONSTRAINT account_pk PRIMARY KEY (id);

ALTER TABLE ONLY trx
 ADD CONSTRAINT transaction_pk PRIMARY KEY (id);

ALTER TABLE ONLY trx
 ADD CONSTRAINT transaction_accountid_fk FOREIGN KEY (accountid) REFERENCES account(id);

And the data:

INSERT INTO account VALUES (1, 200.00, '2016-01-01');
INSERT INTO account VALUES (2, 400.00, '2016-03-02');
INSERT INTO account VALUES (3, 800.00, '2016-01-15');

INSERT INTO trx VALUES (1, 1, -100.00, '2016-01-10');
INSERT INTO trx VALUES (6, 2, -200.00, '2016-03-25');
INSERT INTO trx VALUES (9, 3, -400.00, '2016-02-03');
INSERT INTO trx VALUES (8, 2, -100.00, '2016-09-15');
INSERT INTO trx VALUES (5, 1, -20.00, '2016-06-20');
INSERT INTO trx VALUES (3, 1, -20.00, '2016-03-04');
INSERT INTO trx VALUES (2, 1, -10.00, '2016-02-02');
INSERT INTO trx VALUES (10, 3, -100.00, '2016-04-12');
INSERT INTO trx VALUES (11, 3, -100.00, '2016-04-25');
INSERT INTO trx VALUES (12, 3, -200.00, '2016-04-29');
INSERT INTO trx VALUES (4, 1, -50.00, '2016-06-05');
INSERT INTO trx VALUES (7, 2, -100.00, '2016-08-05');

I've figured out how to use the window function to get running totals for each transaction:

SELECT 
  acct.id accountid,
  acct.startbalance,
  amount,
  transactiondate,
  acct.startbalance + sum(amount) OVER (PARTITION BY acct.id ORDER BY transactiondate) balance
FROM 
account acct 
LEFT JOIN trx ON trx.accountid = acct.id 
ORDER BY transactiondate DESC

This gives me:

accountid opendate startbalance amount transactiondate balance
2   2016-03-02  400.00  -100.00 2016-09-15  0.00
2   2016-03-02  400.00  -100.00 2016-08-05  100.00
1   2016-01-01  200.00  -20.00  2016-06-20  0.00
1   2016-01-01  200.00  -50.00  2016-06-05  20.00
3   2016-01-15  800.00  -200.00 2016-04-29  0.00
3   2016-01-15  800.00  -100.00 2016-04-25  200.00
3   2016-01-15  800.00  -100.00 2016-04-12  300.00
2   2016-03-02  400.00  -200.00 2016-03-25  200.00
1   2016-01-01  200.00  -20.00  2016-03-04  70.00
3   2016-01-15  800.00  -400.00 2016-02-03  400.00
1   2016-01-01  200.00  -10.00  2016-02-02  90.00
1   2016-01-01  200.00  -100.00 2016-01-10  100.00

but it doesn't show anything for the months with no data. What I need is something more like a resultset that shows data for each month in a range (e.g. 1/2016 - 6/2016):

MonthOf AcctId  OrigBal Pmt     Balance
2016-06     3   800     0       0
2016-06     2   400     0       200
2016-06     1   200     -70     0
2016-05     3   800     0       0
2016-05     2   400     0       200
2016-05     1   200     0       70
2016-04     3   800     -400    0
2016-04     2   400     0       200
2016-04     1   200     0       70
2016-03     3   800     -400    400
2016-03     2   400     -200    200
2016-03     1   200     -20     70
2016-02     3   800     -400    400
2016-02     2   0       0       0
2016-02     1   200     -10     90
2016-01     3   800     0       800
2016-01     2   0       0       0
2016-01     1   200     -100    100

I tried examples joining a CTE with a generate_series statement, but couldn't get the expected results. Below is the generate_series result I tried to put in a cte. I couldn't get the balances to add up correctly. I'm new to window functions, so I'm sure there's just something I'm missing or maybe I'm just working down the wrong path.

SELECT  
date_trunc('month', dates) + INTERVAL '1 MONTH - 1 SECOND' end_of_month
FROM 
generate_series('2016-01-01'::timestamp, '2016-6-24'::timestamp, '1 month'::interval) dates
Community
  • 1
  • 1
Chad
  • 3
  • 2

2 Answers2

0

Without a window function (maybe a bit ugly, but it appears to work ...) :

Note: I am ignoring the account.opendate, since it is not relevant for the summing.


UPDATE: unborn accounts are handled in the WHILE clause, now


WITH cal AS ( -- calender table
        SELECT generate_series('2016-01-01'::date, '2016-10-01'::date , '1 month'::interval)::date dt
        )
SELECT
        a.id, a.startbalance
        , c.dt
        , COALESCE(t.cumsum, 0.0)::NUMERIC(19,2) AS cumsum
        , a.startbalance + COALESCE(t.cumsum, 0.0)::NUMERIC(19,2) AS balance
FROM cal c
CROSS JOIN account a
LEFT JOIN LATERAL ( -- aggregating subquery
        SELECT accountid
        , SUM(amount) AS cumsum
        FROM trx
        WHERE transactiondate < c.dt
        GROUP BY accountid
        ) t ON t.accountid = a.id
WHERE a.opendate <= c.dt -- suppress unborn accounts
ORDER BY a.id, c.dt
        ;

Update: we can avoid the LATERAL by an ordinary old-fashioned scalar subquery.


WITH cal AS ( -- calender table
    SELECT generate_series('2016-01-01', '2016-10-01' , '1 month'::interval)::date dt
    )
, xxx AS ( -- grid table
    SELECT a.startbalance , a.id AS accountid , c.dt AS xdt
    FROM cal c -- in fact a cross join ...
    JOIN account a ON a.opendate <= c.dt
    )
SELECT
    x.accountid, x.startbalance, x.xdt
    , (SELECT SUM(t.amount) FROM trx t
        WHERE t.accountid = x.accountid
        AND t.transactiondate <= x.xdt) AS cumsum
    , COALESCE(x.startbalance
             + (SELECT SUM(t.amount) FROM trx t
               WHERE t.accountid = x.accountid
               AND t.transactiondate <= x.xdt)
         , x.startbalance) AS balance
FROM xxx x
ORDER BY x.accountid, x.xdt
        ;
wildplasser
  • 43,142
  • 8
  • 66
  • 109
  • That's a great answer, if I was using 9.3+ (when lateral joins were introduced). I'm stuck with 9.2. Thanks in Advance, if you can come up with something else! – Chad Oct 26 '16 at 15:30
  • Ah! I was just posting a solution I came up with. Our solutions are different, but yours seems more simple with the subqueries in the columns. I was trying too hard to think outside the box. -edit, the results from this query are slightly different because I wanted to get all payments for that month. Changing the subquery filter to : date_trunc('month',t.transactiondate) <= x.xdt for both columns did the trick. Thanks again! – Chad Oct 26 '16 at 19:04
0

This has given me results I can work with:

WITH dates AS (

    SELECT  
    date_trunc('month', dates) + INTERVAL '1 MONTH - 1 SECOND' end_of_month
    FROM 
    generate_series('2016-01-01'::timestamp, '2016-6-24'::timestamp, '1 month'::interval) dates
)


SELECT 
date_trunc('month', dates.end_of_month) endofmonth,
acct.id accountid,
acct.opendate,
acct.startbalance,
acct.startbalance + (SUM(sum(COALESCE(trx.amount, 0))) OVER (PARTITION BY acct.id ORDER BY dates.end_of_month)) balance
FROM 
dates
LEFT join account acct ON date_trunc('month', dates.end_of_month) >= date_trunc('month',acct.opendate)
LEFT JOIN trx ON trx.accountid = acct.id AND date_trunc('month', trx.transactionDate) = date_trunc('month', dates.end_of_month) 
GROUP BY 
dates.end_of_month,
acct.id,
acct.opendate,
acct.startbalance
ORDER BY endofmonth DESC, acct.id DESC
Chad
  • 3
  • 2