1

I have a table that stores numeric balances in different buckets (bucket 1, 2 and 3). When an amount is posted in the system, the amount must be allocated to the buckets. The transaction type determines how it affects the buckets. Type 1, 2 and 3 increase the respective bucket (no waterfall) while type 0 decreases the buckets using a waterfall. This means the type 0 amount first reduces bucket 1, then any leftover reduces bucket 2 and any leftover reduces bucket 3.

I currently apply the amounts with a cursor that loops through each amount posted. Because of the waterfall, the sequence of the amounts being applied is key. Obviously when processing hundreds of thousands of amounts, it's quite slow. I'm wondering if there's a set based way to do this?

Code to generate tables and records:

http://pastebin.com/XgKrKkbm

My cursor that allocates the amounts is as follows:

DECLARE @Instrument int,
        @Type int,
        @Amount numeric(19,2),
        @NewAmount numeric(19,2),
        @Seq int,
        @Bucket1 numeric(19,2),
        @Bucket2 numeric(19,2),
        @Bucket3 numeric(19,2)

DECLARE Waterfall CURSOR Fast_Forward FOR 
        SELECT InstrumentID, TypeID, Amount, Sequence
        FROM Amount
        WHERE EffectiveDate between '2015-06-20' and '2015-06-30'
        ORDER BY InstrumentID, Sequence

OPEN Waterfall;
FETCH NEXT from Waterfall into @Instrument, @Type, @Amount, @Seq

WHILE @@FETCH_STATUS = 0
    BEGIN

    SELECT @Bucket1 = Bucket1, @Bucket2 = Bucket2, @Bucket3 = Bucket3
    FROM Buckets WHERE InstrumentID = @Instrument

        IF @Type > 0 /*Increase Buckets, no waterfall*/
        BEGIN
            SET @Bucket1 = CASE WHEN @Type = 1 THEN @Bucket1 + @Amount ELSE @Bucket1 END
            SET @Bucket2 = CASE WHEN @Type = 2 THEN @Bucket2 + @Amount ELSE @Bucket2 END
            SET @Bucket3 = CASE WHEN @Type = 3 THEN @Bucket3 + @Amount ELSE @Bucket3 END
        END

        ELSE /*Decrease buckets with waterfall*/
        BEGIN

            SET @NewAmount  = CASE WHEN @Amount >= @Bucket1 THEN @Amount - @Bucket1 ELSE 0 END
            SET @Bucket1    = CASE WHEN @Amount >= @Bucket1 THEN 0 ELSE @Bucket1 - @Amount END
            SET @Amount     = @NewAmount
            SET @NewAmount  = CASE WHEN @Amount >= @Bucket2 THEN @Amount - @Bucket2 ELSE 0 END
            SET @Bucket2    = CASE WHEN @Amount >= @Bucket2 THEN 0 ELSE @Bucket2 - @Amount END
            SET @Amount     = @NewAmount
            SET @Bucket3    = CASE WHEN @Amount >= @Bucket3 THEN 0 ELSE @Bucket3 - @Amount END
        END

/*Record effect of each amount on the balances for audit/undo*/
    UPDATE  Amount
    SET     Bucket1 = @Bucket1 - Buckets.Bucket1, 
            Bucket2 = @Bucket2 - Buckets.Bucket2, 
            Bucket3 = @Bucket3 - Buckets.Bucket3
    FROM    Buckets 
        inner join Amount 
            on Amount.InstrumentID = Buckets.InstrumentID
    where Sequence = @Seq

/*update bucket values in table*/    
    UPDATE Buckets
        SET Bucket1 = @Bucket1,
            Bucket2 = @Bucket2,
            Bucket3 = @Bucket3
    WHERE InstrumentID = @Instrument

FETCH NEXT from Waterfall INTO @Instrument, @Type, @Amount, @Seq
END

CLOSE Waterfall
DEALLOCATE Waterfall

Create and Insert scripts are at PasteBin as SO wouldn't let me post the question when they were included (text length?).


Example for clarity around waterfall sequence being critical:

Start with the following:

Instrument | Bucket1 | Bucket2 | Bucket 3
1          |     500 |     200 |     3000

Two amounts to be applied as follows

Instrument | Sequence|    Type |   Amount
1          |       1 |       0 |     800  /*this decreases buckets via waterfall*/
1          |       2 |       1 |     500  /*this increases bucket 1*/

If the amounts are applied sequence, the outcome is as follows:

Instrument | Bucket1 | Bucket2 | Bucket 3
1          |     500 |       0 |     2900

However the amounts are applied them in reverse order, the outcome is incorrect:

Instrument | Bucket1 | Bucket2 | Bucket 3
1          |     200 |     200 |     3000
p3pp
  • 45
  • 6
  • Your task is a variation of running totals. If you are using SQL Server 2012+, then it can do running totals efficiently. If you are using previous version, cursor would be more efficient. – Vladimir Baranov Aug 05 '15 at 02:16
  • Hi @VladimirBaranov, I am on 2012. I attempted this and it works for the increases, but I'm not sure how I would apply the waterfall decreases. My understanding is that you would have three `over` clauses - one for each bucket. These are great for summing based on other columns, but I'm not sure how you would reference the `over` clause as part of the sum (to determine if the bucket would be zero, and then to pass the amount to the next `over`). Maybe I'm approaching this the wrong way? – p3pp Aug 05 '15 at 04:12
  • I don't have practical experience in calculating running totals, but I would start with unpivoting your table: instead of one row with three columns for buckets I would convert that to three rows with one column for bucket value plus extra column for bucket number to know the ordering. This kind of unpivoting can be done on the fly, then performing the running addition/subtraction, then pivoting results back. – Vladimir Baranov Aug 05 '15 at 04:42

1 Answers1

0

Something like this is what you want. Much more efficient than a CURSOR.

UPDATE
  b
SET
  Bucket1 = CASE WHEN a.[Type] = 1 THEN b.Bucket1 + a.Amount
                 WHEN a.[Type] = 0 AND Amount >= b.Bucket1 THEN 0
                 WHEN a.[Type] = 0 AND Amount <  b.Bucket1 THEN b.Bucket1 - a.Amount
                 ELSE Bucket1  
            END,
  Bucket2 = CASE WHEN a.[Type] = 2 THEN b.Bucket2 + a.Amount
                 WHEN a.[Type] = 0 AND Amount >= b.Bucket1 + Bucket2 THEN 0
                  WHEN a.[Type] = 0 AND Amount BETWEEN  b.Bucket1 AND Bucket2 THEN Bucket2 - (a.Amount - b.Bucket1)
                 ELSE Bucket2
            END,
  Bucket3 = CASE WHEN a.[Type] = 3 THEN b.Bucket3 + a.Amount
                 WHEN a.[Type] = 0 AND Amount >= b.Bucket1 + Bucket2 THEN Bucket3  - (a.Amount - b.Bucket1 -  Bucket2)
                 ELSE Bucket3
            END
FROM
  dbo.Buckets AS b
  JOIN 
  (SELECT TOP 10000000 * FROM dbo.Amounts ORDER BY Sequence) AS a
    ON b.InstrumentID = a.InstrumentID
JohnS
  • 1,942
  • 1
  • 13
  • 16
  • Hi JohnS, thanks for your answer, but as far as I can tell this doesn't cover off the waterfall requirements for type 0 amounts. When I run your code bucket2 goes negative (while bucket3 stays positive). Only bucket 3 can ever go negative. Once bucket1 is 0, the rest goes to bucket2, and once bucket2 is zero the rest goes to bucket3. I've updated the question with an additional example showing the impact of sequence. – p3pp Aug 05 '15 at 00:25
  • @p3pp. I have debugged and updated my answer (some plus signs changed to minus). Your test case now works for me. Interestingly, the order of the updates is important as you have pointed out. So my new version JOINS to an ordered by sequence sub-query (which needed the TOP clause to work). – JohnS Aug 05 '15 at 01:41
  • @JohnS, I would be very interested to see a link to official documentation that confirms that server honors such `ORDER BY` clause when performing the join. I always thought that there is no guarantee about the order in which the rows are joined (especially taking into account that there are several different types of joins (loop, merge, hash) and server is free to choose any type). – Vladimir Baranov Aug 05 '15 at 01:53
  • Another change to the evaluation of Bucket2 (Using BETWEEN etc). @Vladimir, actually I agree with you but my tests have shown that the ORDER BY does change the behavior of this UPDATE.. E.g adding DESC to the ORDER BY in the sub-query does change the result as the OP expected in his/her test case, consistently. However, this might only be relied upon if the JOIN can be coerced into using a loop. – JohnS Aug 05 '15 at 02:02
  • @JohnS, I, personally, would not risk putting such code into production. Would you? – Vladimir Baranov Aug 05 '15 at 02:07
  • @JohnS, that would have been a great solution, however when I test it out with the original data set it doesn't seem to return the same results. Original cursor: http://sqlfiddle.com/#!6/f3366/3/0 Order by Update: http://sqlfiddle.com/#!6/4b10c/2/0. The SQL looks good to me, so I can only guess that the order by isn't doing the equivalent of the RBAR. – p3pp Aug 05 '15 at 02:24
  • @Vladimir is definitely right. The cursor method is safer than my solution. The only other thing I can think of is separating the type 0 updates from the other types which would probably need a cursor again. Back to square 1. – JohnS Aug 05 '15 at 02:42
  • @JohnS the original cursor is actually something I wrote years ago, and I've always had in the back of my mind that there must be a better way. I'm likely to have the opportunity to re-write the whole process, so figured I'd ask here in case there's something I'm missing. – p3pp Aug 05 '15 at 02:56