1

I have the following sample data:

order_id    receipt_id    receipt_amount    total_already_invoiced_amount    
14          36            30                150
14          37            30                150
15          42            30                30
16          45            30                60
16          46            40                60
17          50            40                60
17          51            40                60
17          52            40                60

The column receipt_amount is the amount of an order received for that specific line. The column total_already_invoiced_amount is the total amount invoiced for an order.

I want to transform this table into a new one which retains only the lines where there is a received amount which is remaining after deducting the total invoiced amount (first in first out). For example, if I have 3 receipt lines, each of 40, and my total invoiced is 60, then I can figure out that the first receipt line is fully invoiced, the second receipt line has 20 remaining to be invoiced and the third one has not been invoiced at all. I cannot aggregate, I must keep the receipt_id as an index as these can have different dates and I need to be able to distinguish according to that. The result of such query would be the following:

order_id    receipt_id    received_not_invoiced_amount    
16          46            10
17          51            20
17          52            40

I understand I can select group by order_id to get the aggregated receipt_amount, but it will also aggregate the total_already_invoiced_amount, which is not what I want. I am trying the following but that will not perform the FIFO calculation....

 SELECT order_id, 
      receipt_id, 
     (total_already_invoiced_amount - 
          (SELECT receipt_amount FROM X GROUP BY order_id)
     ) total_already_invoiced_amount
 FROM X
 WHERE (total_already_invoiced_amount - 
          (SELECT receipt_amount FROM X GROUP BY order_id)) < 0

I'm a bit lost of where to start with to make this work.

Vincent Teyssier
  • 2,146
  • 5
  • 25
  • 62

2 Answers2

1

You want a cumulative sum:

select order_id, receipt_id,
       least(running_ra, total_already_invoiced_amount), receipt_amount)
from (select x.*,
             sum(receipt_amount) over (partition by order_id order by receipt_id) as running_ra
      from x
     ) x
where running_ra > total_already_invoiced_amount
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
1

In the absence of Windowing functions (not available in MySQL 5.7), one approach is to do a Self-Join and compute Sum of all the receipts for the order, until the receipt row of the first table. We can then use conditional statements to determine the differences accordingly:

Query #1 View on DB Fiddle

SELECT t1.order_id,
       t1.receipt_id,
       CASE
         WHEN Coalesce(Sum(t2.receipt_amount), 0) <=
              t1.total_already_invoiced_amount
       THEN 0
         ELSE Least(Coalesce(Sum(t2.receipt_amount), 0) -
                    t1.total_already_invoiced_amount,
                     t1.receipt_amount)
       end AS received_not_invoiced_amount
FROM   X t1
       LEFT JOIN X t2
              ON t2.order_id = t1.order_id
                 AND t2.receipt_id <= t1.receipt_id
GROUP  BY t1.order_id,
          t1.receipt_id,
          t1.receipt_amount,
          t1.total_already_invoiced_amount
HAVING received_not_invoiced_amount > 0;

| order_id | receipt_id | received_not_invoiced_amount |
| -------- | ---------- | ---------------------------- |
| 16       | 46         | 10                           |
| 17       | 51         | 20                           |
| 17       | 52         | 40                           |

For good performance, you can define the following composite index: (order_id, receipt_id).


Another approach is using User-defined Variables. It is like a looping technique, where we calculate rolling (cumulative) sum over order_id, as we move down the receipts. Based on the sum, we determine whether excess payment received or not accordingly. For more detailed explanation on this technique, you may check this answer: https://stackoverflow.com/a/53465139

Query #2 View on DB Fiddle

SELECT order_id,
       receipt_id,
       received_not_invoiced_amount
FROM   (SELECT @s := IF(@o = order_id, @s + receipt_amount, receipt_amount) AS
               cum_receipt_amount,
               IF(@s <= total_already_invoiced_amount, 0,
               Least(@s - total_already_invoiced_amount, receipt_amount))   AS
               received_not_invoiced_amount,
               @o := order_id                                               AS
               order_id
                      ,
               receipt_id
        FROM   (SELECT *
                FROM   X
                ORDER  BY order_id,
                          receipt_id) t1
               CROSS JOIN (SELECT @o := 0,
                                  @s := 0) vars) t2
WHERE  received_not_invoiced_amount > 0;

| order_id | receipt_id | received_not_invoiced_amount |
| -------- | ---------- | ---------------------------- |
| 16       | 46         | 10                           |
| 17       | 51         | 20                           |
| 17       | 52         | 40                           |

For good performance, you can define the same composite index: (order_id, receipt_id).


You may benchmark both the approaches for best performance.

Madhur Bhaiya
  • 28,155
  • 10
  • 49
  • 57
  • Awesome! Beautiful query. I've applied some more complex examples and it works like a charm! – Vincent Teyssier Oct 04 '19 at 08:43
  • 1
    @VincentTeyssier you may check the updated answer for two query approaches for MySQL 5.7 (MySQL 8 will allow us to use Windowing functions as well). You *may* benchmark and decide the best query for you. – Madhur Bhaiya Oct 04 '19 at 09:06