2

I have a bit of a challenge ahead of me with a report I need to write.

I have an ordered selected list of results, which has the following:

+---------+----------+----------+
| Header  | estimate | TargetId |
+---------+----------+----------+
| Task 1  |       80 |        1 |
| Task 2  |       30 |        1 |
| Task 3  |       40 |        2 |
| Task 4  |       10 |        2 |
+---------+----------+----------+

I’d like to join this onto another set of data containing the Target information:

+--------+----------+
| Target | Capacity |
+--------+----------+
|      1 |      100 |
|      2 |       50 |
|      3 |       50 |
+--------+----------+

However I’d like to do some sort of pivot / cross join to fill each target to capacity and report this in a way to show a forecast of when each of the tasks for the target will be met.

+---------+----------+----------+----------+----------+---+---+
| Header  | Overfill | Target 1 | Target 2 | Target 3 | … | … |
+---------+----------+----------+----------+----------+---+---+
| Task 1  | No       |       80 |        0 |        0 | 0 | 0 |
| Task 2  | Yes      |       20 |       10 |        0 | 0 | 0 |
| Task 3  | No       |        0 |       40 |        0 | 0 | 0 |
| Task 4  | Yes      |        0 |        0 |       10 | 0 | 0 |
+---------+----------+----------+----------+----------+---+---+

Alternatively displayed:

+---------+--------+-----------+
| Header  | Target | Overfill% |
+---------+--------+-----------+
| Task 1  | 1      | 0         |
| Task 2  | 1,2    | 33.33     |
| Task 3  | 2      | 0         |
| Task 4  | 3      | 100%      |
+---------+--------+-----------+

The actual set of data will involve a few hundred tasks across 20 – 30 targets, unfortunately I don’t have any code to show as a demonstration, short of the few simple selects, as I’m not sure how to approach the overfill.

I believe this could be achieved through C# easier however I was hoping this could be completed as a pure SP operation so I can return the data as I wish to display it.

Any help or a nudge in the right direction to take would be greatly appreciated, Chris

Chris
  • 915
  • 8
  • 28
  • The fact that you know this is easier in C# is a red flag -- why not do it in C#? – Hogan Nov 01 '18 at 15:28
  • 1
    You can use `dynamic sql` and `pivot` with `FOR XML PATH` to achieve this in `SQL`, possible dublicate (https://stackoverflow.com/questions/10404348/sql-server-dynamic-pivot-query) – Ryan Wilson Nov 01 '18 at 15:30
  • Our users have the ability to run SP's directly from our DB, i'd rather return results as I wish to display. if it's too hard to do in SQL then I will do it in C# and endure that side of it. However I wouldn't say that I have a huge depth of knowledge with SQL to claim that there's not some nice function that could do it – Chris Nov 01 '18 at 15:32
  • @RyanWilson I have done something similar before, but that wouldn't help with the overfill, example above for task 4 targeting 2 but ending up in 3. – Chris Nov 01 '18 at 15:34

1 Answers1

2

Doing this in SQL is a bad idea, but it is possible with a recursive CTE. The solution below uses a recursive CTE with a result set that maintains the state of the solution as it goes. It queries one record for each source for each recursive iteration and updates the state with the results of certain calculations. Depending on the state in the result it will either advance the sequence, the target, or both.

This solution assumes the targets and headers are sequentially ordered. If the targets aren't sequentially ordered, you can use a CTE to add ROW_NUMBER() to targets. Also if you have more than 32767 steps in the solution it will fail as that is the max recursion that sql server supports. Steps should be at most tasks + targets.

One nice thing is that it will handle overflow across multiple targets. For example, if a task has an estimate it that will fill up multiple targets, then the next task will start at the next available bucket, not the assigned one. Go ahead and put some crazy numbers in there.

Finally, I didn't know how you were deriving overflow percentage, I don't know how you got the last row's result from your sample data. I doubt whatever the answer should be would be difficult to derive once the criteria is known.

/** Setup Test Data **/
DECLARE @Tasks TABLE ( Header VARCHAR(20), Estimate INT, TargetId INT );
DECLARE @Targets TABLE ( TargetId INT, Capacity INT );

INSERT INTO @Tasks VALUES 
( 'Task 1', 80, 1 ), ( 'Task 2', 30, 1 ), ( 'Task 3', 40, 2 ), ( 'Task 4', 10, 2 );

INSERT INTO @Targets VALUES ( 1, 100 ), ( 2, 50 ), ( 3, 50 );

/** Solution **/

WITH Sequenced AS (
    -- Added SequenceId for tasks as it feels janky to order by headers.
    SELECT CAST(ROW_NUMBER() OVER (ORDER BY Header) AS INT) [SequenceId], tsk.*
    FROM @Tasks tsk
)
, TargetsWithOverflow AS (
    SELECT *
    FROM @Targets
    UNION
    SELECT MAX(TargetId) + 1, 99999999 -- overflow target to store excess not handled by targets
    FROM @Targets
)
, src AS (
    -- intialize state
    SELECT 0 [SequenceId], CAST('' AS varchar(20)) [Header], 0 [Estimate], 0 [CurrentTargetId]
        , 0 [CurrentTargetFillLevel], 0 [SequenceRemainingEstimate], 0 [OverfillAmt]

    UNION ALL

    SELECT seq.SequenceId, seq.header, seq.Estimate, tgt.TargetId
        , CASE WHEN [Excess] <= 0 THEN TrueFillLevel + TrueEstimate -- capacity meets estimate
            ELSE tgt.Capacity -- there is excess estimate
        END 
        , CASE WHEN [Excess] <= 0 THEN 0 -- task complete
            ELSE [Excess] -- task is not complete still some of estimate is left
        END
        , CASE WHEN tgt.TargetId != seq.TargetId THEN  
            CASE WHEN [Excess] > 0 THEN [TrueEstimate] - [Excess] ELSE [TrueEstimate] END
            ELSE 0 
        END
    FROM src
    INNER JOIN Sequenced seq ON 
        (src.SequenceRemainingEstimate = 0 AND seq.SequenceId = src.SequenceId + 1)
        OR (src.SequenceRemainingEstimate > 0 AND seq.SequenceId = src.SequenceId)
    INNER JOIN TargetsWithOverflow tgt ON 
        -- Part of target selection is based on if the sequence advanced.
        -- If the sequence has advanced then get the target assigned to the sequence 
        -- Or use the current one if it is GTE to the assigned target.
        -- Otherwise get the target after current target.
        (tgt.TargetId = seq.TargetId AND tgt.TargetId > src.CurrentTargetId AND seq.SequenceId != src.SequenceId)
        OR (tgt.TargetId = src.CurrentTargetId AND tgt.Capacity >= src.CurrentTargetFillLevel AND seq.SequenceId != src.SequenceId)
        OR (tgt.TargetId = src.CurrentTargetId + 1 AND seq.SequenceId = src.SequenceId)
    CROSS APPLY (
        SELECT CASE WHEN tgt.TargetId != src.CurrentTargetId THEN 0 ELSE src.CurrentTargetFillLevel END [TrueFillLevel] 
    ) forFillLevel
    CROSS APPLY (
        SELECT tgt.Capacity - [TrueFillLevel] [TrueCapacity]
    ) forCapacity
    CROSS APPLY (
        SELECT CASE WHEN src.SequenceRemainingEstimate > 0 THEN src.SequenceRemainingEstimate ELSE seq.Estimate END [TrueEstimate]
    ) forEstimate
    CROSS APPLY (
        SELECT TrueEstimate - TrueCapacity [Excess]
    ) forExcess
)
SELECT src.Header
    , LEFT(STUFF((SELECT ',' + RTRIM(srcIn.CurrentTargetId)
            FROM src srcIn
            WHERE srcIn.Header = src.Header
            ORDER BY srcIn.CurrentTargetId
            FOR XML PATH(''), TYPE).value('.', 'varchar(max)'), 1, 1, ''), 500)
    [Target] 
    , CASE WHEN SUM(OverfillAmt) > 0 THEN 'Yes' ELSE 'No' END [Overfill]
    , SUM (OverfillAmt) / (1.0 * AVG(seq.Estimate)) [OverfillPct]
FROM src
INNER JOIN Sequenced seq ON seq.SequenceId = src.SequenceId
WHERE src.SequenceId != 0
GROUP BY src.Header
OPTION (MAXRECURSION 32767)

Output

Header               Target     Overfill OverfillPct
-------------------- ---------- -------- ----------------
Task 1               1          No       0.00000000000000
Task 2               1,2        Yes      0.33333333333333
Task 3               2          No       0.00000000000000
Task 4               2,3        Yes      1.00000000000000

I just re-read your question and realized that you intend to run this query within a Stored Procedure. If that's the case, you could use techniques from this method and adapt them in a solution that uses a cursor. I hate them, but I doubt it would work any worse than this solution, and wouldn't have the recursion limitation. You'd just store the results into a temp table or table variable and then return the result of the stored procedure from that.

Daniel Gimenez
  • 18,530
  • 3
  • 50
  • 70
  • Thanks so much for approaching this with a really nice example of the solution. I have a minor issue with it which I was hoping you might be able to explain or fix. In your data Task 4 is pointing at target 3 where in actuality that would be target 2 and i'd expect it to overfill into 3 by 100%. If I change your data Task 4 is no longer returned in the results. - Appreciate any help on this – Chris Nov 02 '18 at 12:08
  • The join to TargetsWithOverflow was tricky - I had mistakenly used a Great Than when I needed GTE. Had I copied your source example correctly I would've noticed that. I updated my answer using the source example data and corrected the error. – Daniel Gimenez Nov 05 '18 at 16:07
  • Ah, figures it would be something simple, really appreciate the help and i'll spend the next week or so learning how this actually works. – Chris Nov 06 '18 at 11:43
  • Hi Daniel, I am trying to modify your SQL so that there are two target ids: one generated from a RowNumber and one from the actual TargetId this is so that even if a target is deleted the sql still works, currently if the targets go 1,3,4 only 2 tasks are picked up and the results are wrong. I think it is todo with "src.CurrentTargetId + 1" where adding 1 to an Id is quite flaky but if this was replaced with a row number it would be fine. Any help would be appreciated - Chris – Chris Nov 13 '18 at 14:11