3

Consider the following 2 tables:

TABLE A:

PIN | ENCOUNTER | BALANCE | REFERENCE_DATE 
------------------------------------------
P1  | ABC       | 100     | 11-19-2014     
P1  | HJI       | 300     | 11-20-2014     
P1  | PIY       | 700     | 11-21-2014     
P2  | CDO       | 200     | 11-20-2014     
P2  | NHG       | 200     | 11-21-2014    
P3  | CVB       | 500     | 11-20-2014
P3  | SJK       | 100     | 11-21-2014     

TABLE B:

PIN | DEPOSIT
-------------
P1  | 1000
P2  | 400
P3  | 100

Initially Table B's DEPOSIT value is to be subtracted from the BALANCE in table A with the earliest REFERENCE_DATE matched by the PIN. Should the difference be greater than 0, it would be subtracted from the BALANCE in the next row until the remaining DEPOSIT becomes less than or equal to 0.

The result after subtracting the deposit from the balances would look like this. I've included another column wherein the deposits are divided per encounter:

PIN | ENCOUNTER | BALANCE | REFERENCE_DATE | DEPOSITS_BREAKDOWN
---------------------------------------------------------------
P1  | ABC       | 0       | 11-19-2014     | 100
P1  | HJI       | 0       | 11-20-2014     | 300
P1  | PIY       | 100     | 11-21-2014     | 600
P2  | CDO       | 0       | 11-20-2014     | 200
P2  | NHG       | 0       | 11-21-2014     | 200
P3  | CVB       | 400     | 11-20-2014     | 100
P3  | SJK       | 100     | 11-21-2014     | 0

My Postgres version is 9.3. I'm struggling to formulate the query for this one.

dilm
  • 687
  • 2
  • 7
  • 14
  • The last result row should probably be 500, not 100. – Erwin Brandstetter Nov 26 '14 at 08:49
  • @ErwinBrandstetter, if so, then I have wasted 30 minutes. This `100` is a bit tough to deal with. – vyegorov Nov 26 '14 at 08:58
  • Actually the last row is really 100. the 100 deposit allocated for P3 rows in table A is already subtracted to the row with CVB encounter. – dilm Nov 26 '14 at 09:28
  • Thank you so much sir. But is it possible to also list in another column how much the deposits were divided? I've updated the result table. – dilm Nov 27 '14 at 05:40
  • Yes, that's cheaply possible. Aside: are you aware that without @-reply nobody was notified of your comment? Found it by chance. – Erwin Brandstetter Nov 27 '14 at 06:13
  • @ErwinBrandstetter I wasn't aware of using @-reply until now. Thanks! I'm beginning to learn window functions because of you answer. Last question regarding the clause, ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING. Is it like referring to the rows before the current row? – dilm Nov 27 '14 at 07:12
  • Yes. I added another link to an answer with more details. – Erwin Brandstetter Nov 27 '14 at 08:27

2 Answers2

2

Set to 0 as far as DEPOSIT covers BALANCE

As you clarified, you don't want a running sum of BALANCE, just set to 0 until DEPOSIT is spent:

SELECT PIN, ENCOUNTER
     , CASE WHEN last_sum >= DEPOSIT THEN BALANCE
            ELSE GREATEST (last_sum + BALANCE - DEPOSIT, 0) END AS BALANCE
     , REFERENCE_DATE
     , CASE WHEN last_sum >= DEPOSIT THEN 0
            ELSE LEAST (BALANCE, DEPOSIT - last_sum) END AS DEPOSITS_BREAKDOWN
FROM (
   SELECT a.*
        , COALESCE(sum(a.BALANCE) OVER (
                         PARTITION BY PIN ORDER BY a.REFERENCE_DATE
                         ROWS BETWEEN UNBOUNDED PRECEDING
                                          AND 1 PRECEDING), 0) AS last_sum
        , COALESCE(b.DEPOSIT, 0) AS DEPOSIT
   FROM        table_a a
   LEFT   JOIN table_b b USING (pin)
   ) sub;

Returns your desired result exactly.

SQL Fiddle.

  • I adopted the idea for the simpler join from @vyegorov as commented.

  • LEFT JOIN to table_b - that leaves the possibility that no row in table_b might be found.

  • In the subquery, compute the running sum of BALANCE up until the last row (last_sum). Use a custom frame in the window function for this. And COALESCE to default to 0 where there are not rows. Related answers with more explanation for the custom frame:

  • In the final SELECT, return original BALANCE if last_sum is equal or greater than DEPOSIT (it has been spent). ELSE return the remaining difference or 0 is the running sum of BALANCE (last_sum + BALANCE) is smaller than DEPOSIT.

Running sum

Previous (simpler) answer with the BALANCE as running sum (last row 500 instead of 100):

SELECT a.PIN, a.ENCOUNTER
     , GREATEST(sum(a.BALANCE) OVER (PARTITION BY PIN ORDER BY a.REFERENCE_DATE) 
                 - COALESCE(b.DEPOSIT, 0), 0) AS BALANCE
     , a.REFERENCE_DATE
FROM   table_a      a
LEFT   JOIN table_b b USING (pin);
Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
1

I came up with this query:

SELECT *,
       sum(balance) OVER w                          balance_accum,
       greatest(deposit - sum(balance) OVER w, 0)   deposit_new,
       greatest(sum(balance) OVER w - deposit, 0)   balance_new
  FROM table_a JOIN table_b USING(pin)
WINDOW w AS (PARTITION BY pin ORDER BY reference_date)
 ORDER BY pin, reference_date;

SQL Fiddle

As Erwin mentioned, this one assumes last row contains 500 rather then 100.


EDIT:

And this query produces the desired output:

SELECT s.*,
       CASE WHEN min(deposit_new) OVER w = 0 THEN 0 
            ELSE least(min(deposit_new) OVER w, deposit_diff) END     deposit_used,
       balance -
         CASE WHEN min(deposit_new) OVER w = 0 THEN 0
              ELSE least(min(deposit_new) OVER w, deposit_diff) END   balance_real
  FROM
  (
    SELECT *,
           sum(balance) OVER w                                      balance_accum,
           greatest(coalesce(deposit,0) - sum(balance) OVER w, 0)   deposit_new,
           least(balance, coalesce(deposit,0))                      deposit_diff
      FROM table_a LEFT JOIN table_b USING(pin)
    WINDOW w AS (PARTITION BY pin ORDER BY reference_date)
  ) s
WINDOW w AS (PARTITION BY pin ORDER BY reference_date
             ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
 ORDER BY pin, reference_date;

What's on here (subquery):

  • as Erwin suggests, LEFT JOIN and coalesce(deposit,0) are used to keep entries without deposits;
  • running total of actual balance values is computed and substracted from deposit (column deposit_new in the output);
  • deposit_diff is a least of balance and deposit, it is used to adjust balances in the outer part.

In the outer part:

  • check minimal deposit_new value and if 0 is reached, then all further "usage" is skipped;
  • otherwise take least of deposit_new and deposit_diff values;
  • checks are done for all preceding rows in the group.

Subquery is required for I have to use results of the window functions in the logic.

SQL Fiddle 2

vyegorov
  • 21,787
  • 7
  • 59
  • 73