1

In the stored procedure (I'm using SQL server2008), I'm having a business like this sample:

ID  City    Price   Sold    
1   A         10    3
1   B         10    5
1   A         10    1
1   B         10    3
1   C         10    5
1   C         10    2

2   A         10    1
2   B         10    6
2   A         10    3
2   B         10    4
2   C         10    3
2   C         10    4

What I want to do is:

  • with each ID, sort by City first.

  • After sort, for each row of this ID, re-calculate Sold from top to bottom with condition: total of Sold for each ID does not exceed Price (as the result below).

And the result like this:

ID  City    Price   Sold_Calculated 
1   A         10    3
1   A         10    1
1   B         10    5
1   B         10    1 (the last one equal '1': Total of Sold = Price)
1   C         10    0 (begin from this row, Sold = 0)
1   C         10    0

2   A         10    1
2   A         10    3
2   B         10    6
2   B         10    0 (begin from this row, Sold = 0)
2   C         10    0
2   C         10    0

And now, I'm using the Cursor to do this task: Get each ID, sort City, calculate Sold then, and save to temp table. After finish calculating, union all temp tables. But it take a long time.

What I know people advise is, DO NOT use Cursor.

So, with this task, can you give me the example (with using select form where group) to finish? or do we have other ways to solve it quickly?

I understand this task is not easy for you, but I still post here, hope that there is someone helps me to go through.

I'm very appriciated for your help.

Thanks.

Noel
  • 10,152
  • 30
  • 45
  • 67
Lang thang
  • 281
  • 1
  • 2
  • 11
  • 4
    SQL tables are inherently unordered. How are you determining the order of the rows after you sort by city? Also, what database are you using? – Gordon Linoff Aug 14 '13 at 03:11
  • I wonder if this is only for back-end and have no front-end (i.e. php, .Net etc)? – Edper Aug 14 '13 at 03:11
  • brother, i dont really understand ur question(s), would u mind post the sample result that u want? – Low Chee Mun Aug 14 '13 at 03:22
  • @GordonLinoff, I added more comment to the question: "- After sort, re-calculate Sold from top to bottom (for each row) with condition: total of Sold not exceed Price (as the result below).". And I'm using SQL server2008. – Lang thang Aug 14 '13 at 07:49
  • @Edper, yes, I'm using the stored procedure with Cursor now. – Lang thang Aug 14 '13 at 07:51
  • @CheemunLow, as my quesion, I've already posted the result. Please recheck. – Lang thang Aug 14 '13 at 07:52
  • 2
    There are two problems here essentially: 1) how to return a column value conditionally and 2) how to calculate running totals, because a running total is what you'll need to check on in #1's condition. Apparently the first one is trivial, as for the second one, it's been asked here [plenty of times](http://www.google.com/cse?cx=018205968162215846785%3A7n6ajnwyz-i&ie=UTF-8&q="sql+server"+running+totals). In light of all that, I'm voting to close this as a duplicate of a past running total question of my choice: http://stackoverflow.com/questions/11310877/calculate-running-total-running-balance – Andriy M Aug 15 '13 at 12:31
  • @AndriyM actually your analysis isn't complete on #1. It's how to return a column conditionally based upon a previous row. – Conrad Frix Aug 15 '13 at 15:55
  • @ConradFrix: Depends on how you view it. You could simply compare `Price` to a running total at every point. If the running total exceeds the `Price`, return 0, otherwise return `Sold`. – Andriy M Aug 15 '13 at 16:09
  • @AndriyM Nope. Look at the 4th record in the desired output. It changed the sold from 3 to 1, not simpy 0. This is because the previous running total was 9. Leaving 1 left over for the next record – Conrad Frix Aug 15 '13 at 16:12
  • @ConradFrix: I see and I did miss that, thanks. Still, there's no need to peek at the previous row's value, you can instead cap the (current) `Sold` value at `Price - RunningTotal`, defaulting to 0 if the diff is negative. – Andriy M Aug 15 '13 at 16:17
  • @AndriyM sorry I don't follow. I'm very interested to see how that would be done. Can you demo that. – Conrad Frix Aug 15 '13 at 16:25
  • @ConradFrix: http://sqlfiddle.com/#!6/f06f1/57 Not the exact implementation of what I said, but still no need to self-join specifically to access the previous row (a self-join to get the running totals is unavoidable, of course, unless one chose to use other methods). – Andriy M Aug 15 '13 at 16:44
  • @AndriyM Nice. I'll probably update my answer to inlcude Aaron's recursive CTE solution and your case statement – Conrad Frix Aug 15 '13 at 17:36
  • @ConradFrix: Yes, I think it makes perfect sense to have those under one roof. – Andriy M Aug 15 '13 at 17:47
  • @AndriyM Done and I marked it as community wiki. – Conrad Frix Aug 15 '13 at 18:55

4 Answers4

3

In order to accomplish your task you'll need to calculate a running sum and use a case statement

Previously I used a JOIN to do the running sum and Lag with the case statement

However using a recursive Cte to calculate the running total as described here by Aaron Bertand, and the case statement by Andriy M we can construct the following, which should offer the best performance and doesn't need to "peek at the previous row"

WITH cte 
     AS (SELECT Row_number() 
                  OVER ( partition BY id ORDER BY id, city, sold DESC) RN, 
                id, 
                city, 
                price, 
                sold 
         FROM   table1), 
     rcte 
     AS ( 
        --Anchor 
        SELECT rn, 
               id, 
               city, 
               price, 
               sold, 
               runningTotal = sold 
        FROM   cte 
        WHERE  rn = 1 
         --Recursion 
         UNION ALL 
         SELECT cte.rn, 
                cte.id, 
                cte.city, 
                cte.price, 
                cte.sold, 
                rcte.runningtotal + cte.sold 
         FROM   cte 
                INNER JOIN rcte 
                        ON cte.id = rcte.id 
                           AND cte.rn = rcte.rn + 1) 
SELECT id, 
       city, 
       price, 
       sold, 
       runningtotal, 
       rn, 
       CASE 
         WHEN runningtotal <= price THEN sold 
         WHEN runningtotal > price 
              AND runningtotal < price + sold THEN price + sold - runningtotal 
         ELSE 0 
       END Sold_Calculated 
FROM   rcte 
ORDER  BY id, 
          rn; 

DEMO

Community
  • 1
  • 1
Conrad Frix
  • 51,984
  • 12
  • 96
  • 155
  • hi Conrad Frix, your demo is correct. That's what I want. But I cannot run your query in my SQL server 2008 because of missing built-in function 'Lag'. Could you please find another way to solve my task? – Lang thang Aug 14 '13 at 08:16
  • your new approach is great. Thank to you, Andriy M a lot. – Lang thang Aug 19 '13 at 09:26
1

As @Gordon Linoff commented, the order of sort is not clear from the question. For the purpose of this answer, I have assumed the sort order as city, sold.

select id, city, price, sold, running_sum,
       lag_running_sum,
       case when running_sum <= price then Sold
            when running_sum > price and price > coalesce(lag_running_sum,0) then price - coalesce(lag_running_sum,0)
            else 0
       end calculated_sold
       from
       (
        select id, city, price, sold,
               sum(sold) over (partition by id order by city, sold
                               rows between unbounded preceding and current row) running_sum,
               sum(sold) over (partition by id order by city, sold
                               rows between unbounded preceding and 1 preceding) lag_running_sum
          from n_test
       ) n_test_running
 order by id, city, sold;

Here is the demo for Oracle.

Let me break down the query.

I have used SUM as analytical function to calculate the running sum.

  1. The first SUM, groups the rows based on id, and in each group orders the row by city and sold. The rows between clause tell which rows to be considered for adding up. Here i have specified it to add current row and all other rows above it. This gives the running sum.
  2. The second one does the same thing except for, the current row is excluded from adding up. This essentially creates a running sum but lagging the previous sum by one row.

Using this result as inline view, the outer select makes use of CASE statement to determine the value of new column.

  1. As long as the running sum is less than or equal to price it gives sold.
  2. If it crosses the price, the value is adjusted so that sum becomes equal to price.
  3. For the rest of the rows below it, value is set as 0.

Hope my explanation is quite clear.

Noel
  • 10,152
  • 30
  • 45
  • 67
  • Thank Ramblin very much. But I cannot run your query by the error: "Incorrect syntax near 'rows'." – Lang thang Aug 14 '13 at 08:06
  • Sorry for that. Looks like SQL Server 2008 doesn't support SUM as analytical function. However sql server 2012 does. This is the reason why should always tag the question with specific RDBMS you are using. – Noel Aug 14 '13 at 08:16
  • @Ramblin'Man SQL Server 2008 does support `SUM() OVER` but not the `rows between unbounded preceding and current row`. Still a nice answer. – Conrad Frix Aug 14 '13 at 15:06
0

To me, it sounds like you could use window functions in a case like this. Is this applicable?

Although in my case your end result would possibly look like:

ID  City Price Sold_Calculated
2   A    10    4
2   B    10    6
2   C    10    0

Which could have an aggregation like

SUM(Sold_Calculated) OVER (PARTITION BY ID, City, Price, Sold_Calculated) 

depending on how far down you want to go.. You could even use a case statement if need be

Brent
  • 578
  • 3
  • 12
  • When you use SUM an then partition by unique values you won't get a meaningful sum. – Conrad Frix Aug 14 '13 at 04:31
  • I don't understand. I think the sample I give is very clear. Why a lot of people don't understand. I added some detail, please review it. Thank you very much. – Lang thang Aug 14 '13 at 08:00
0

Are you looking to do this entirely in SQL? A simple approach would be this:

SELECT C.ID,
       C.City,
       C.Price,
       calculate_Sold_Function(C.ID, C.Price) AS C.Sold_Calculated
FROM CITY_TABLE C
GROUP BY C.City

Where calculate_Sold_Function is a T-SQL/MySQL/etc function taking the ID and Price as parameters. No idea how you plan on calculating price.

Phillip Carter
  • 4,895
  • 16
  • 26