Calculation in Sql Server

Question

I am trying to perform the following calculation.

Sample data:

CREATE TABLE #Table1
  (
     rno   int identity(1,1),
     ccp   varchar(50),
     [col1] INT,
     [col2] INT,
     [col3] INT,
     col4 as [col2]/100.0
  );

INSERT INTO #Table1
            (ccp,[col1],[col2],[col3])
VALUES      ('ccp1',15,10,1100),
            ('ccp1',20,10,1210),
            ('ccp1',30,10,1331),
            ('ccp2',10,15,900),
            ('ccp2',15,15,1000),
            ('ccp2',20,15,1010)

+-----+------+------+------+------+----------+
| rno | ccp  | col1 | col2 | col3 |   col4   |
+-----+------+------+------+------+----------+
|   1 | ccp1 |   15 |   10 | 1100 | 0.100000 |
|   2 | ccp1 |   20 |   10 | 1210 | 0.100000 |
|   3 | ccp1 |   30 |   10 | 1331 | 0.100000 |
|   4 | ccp2 |   10 |   15 |  900 | 0.150000 |
|   5 | ccp2 |   15 |   15 | 1000 | 0.150000 |
|   6 | ccp2 |   20 |   15 | 1010 | 0.150000 |
+-----+------+------+------+------+----------+

Note: it is not just 3 records; each ccp can have any number (N) of records.

Expected Result :

1083.500000 --1100 - (15 * (1+0.100000))
1169.850000 --1210 - ((20 * (1+0.100000)) + (15 * (1+0.100000)* (1+0.100000)) )
1253.835000 --1331 - ((30 * (1+0.100000)) + (20 * (1+0.100000)* (1+0.100000)) + (15 * (1+0.100000)* (1+0.100000) *(1+0.100000)) )
888.500000  --900 - (10 * (1+0.150000))
969.525000  --1000 - ((15 * (1+0.150000)) + (10 * (1+0.150000)* (1+0.150000)) )
951.953750  --1010 - ((20 * (1+0.150000)) + (15 * (1+0.150000)* (1+0.150000)) + (10 * (1+0.150000)* (1+0.150000) *(1+0.150000)) )
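The pattern in the comments above can be written as: for the n-th row of a ccp, result = col3 - SUM over k <= n of col1_k * (1 + col4)^(n - k + 1). The following is only a minimal Python sketch to pin that definition down and check the six expected values; it is not T-SQL, and the names `ROWS` and `expected_results` are made up for illustration. It assumes the rate of the current row is applied to all earlier rows, as in the formulas above.

```python
from decimal import Decimal

# Sample rows from the question: (rno, ccp, col1, col2, col3)
ROWS = [
    (1, "ccp1", 15, 10, 1100),
    (2, "ccp1", 20, 10, 1210),
    (3, "ccp1", 30, 10, 1331),
    (4, "ccp2", 10, 15, 900),
    (5, "ccp2", 15, 15, 1000),
    (6, "ccp2", 20, 15, 1010),
]

def expected_results(rows):
    """Return {rno: col3 - sum(col1_k * (1 + col4) ** (n - k + 1))} per ccp."""
    results = {}
    history_by_ccp = {}
    for rno, ccp, col1, col2, col3 in sorted(rows):
        history = history_by_ccp.setdefault(ccp, [])
        history.append(col1)
        rate = 1 + Decimal(col2) / 100  # 1 + col4
        n = len(history)
        # Newest col1 gets exponent 1, the one before it exponent 2, and so on.
        total = sum(c * rate ** (n - k) for k, c in enumerate(history))
        results[rno] = col3 - total
    return results

for rno, value in expected_results(ROWS).items():
    print(rno, value)
```

Written this way, every row re-reads all earlier rows of its ccp, which is the "triangular" cost that makes the problem expensive at 5 million rows.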

I know we can do this using a recursive CTE, but that is not efficient, since I have to do this for more than 5 million records.

I am looking to implement something like the following set-based approach.

For ccp : ccp1

SELECT col3 - ( col1 * ( 1 + col4 ) )
FROM   #Table1
WHERE  rno = 1

SELECT rno,
       col3 - ( ( col1 * Power(( 1 + col4 ), 1) )
                + ( Lag(col1, 1) OVER(ORDER BY rno) * Power(( 1 + col4 ), 2) ) )
FROM   #Table1
WHERE  rno IN ( 1, 2 )

SELECT rno,
       col3 - ( ( col1 * Power(( 1 + col4 ), 1) )
                + ( Lag(col1, 1) OVER(ORDER BY rno) * Power(( 1 + col4 ), 2) )
                + ( Lag(col1, 2) OVER(ORDER BY rno) * Power(( 1 + col4 ), 3) ) )
FROM   #Table1
WHERE  rno IN ( 1, 2, 3 )

Is there a way to calculate this in a single query?

Update :

I am still open to suggestions. I strongly believe there should be some way to do this using the SUM () Over(Order by) window aggregate function.

You have 5 million records, does it mean that on Id 4, you will add id 3, 2, 1 and on Id 10 you will add 9, 8 ..., 3, 2, 1? Or you keep going with 3 consecutive? — Veljko89, Jan 23 '17 at 10:28
It's an awesome problem ... but as far as I can see you need a running total here. I'm trying to make it work, but it will take some time as I am at work right now. Check out running totals, maybe you'll manage before me — Veljko89, Jan 23 '17 at 11:10
I doubt you will find any way of doing it with `SUM () Over(Order by)` - all that does is get the running total of an expression. Not allow you to manipulate the previous running total and then use the result as the new running total going forward. — Martin Smith, Jan 25 '17 at 21:30
The same options as general running totals before `SUM () Over(Order by)` was introduced I think. Triangular join, recursive CTE, quirky update, cursor, iterative CLR. — Martin Smith, Jan 26 '17 at 07:40

score 30 · Answer 1 · answered Jan 23 '17 at 11:46

Finally, I achieved the result using the approach below.

SELECT a.*,
       col3 - res AS Result
FROM   #TABLE1 a
       CROSS apply (SELECT Sum(b.col1 * Power(( 1 + b.COL2 / 100.00 ), new_rn)) AS res
                    FROM   (SELECT Row_number()
                                     OVER(
                                       partition BY ccp
                                       ORDER BY rno DESC) new_rn,*
                            FROM   #TABLE1 b
                            WHERE  a.ccp = b.ccp
                                   AND a.rno >= b.rno)b) cs

Result :

+-----+------+------+------+------+----------+-------------+
| rno | ccp  | col1 | col2 | col3 |   col4   |   Result    |
+-----+------+------+------+------+----------+-------------+
|   1 | ccp1 |   15 |   10 | 1100 | 0.100000 | 1083.500000 |
|   2 | ccp1 |   20 |   10 | 1210 | 0.100000 | 1169.850000 |
|   3 | ccp1 |   30 |   10 | 1331 | 0.100000 | 1253.835000 |
|   4 | ccp2 |   10 |   15 |  900 | 0.150000 | 888.500000  |
|   5 | ccp2 |   15 |   15 | 1000 | 0.150000 | 969.525000  |
|   6 | ccp2 |   20 |   15 | 1010 | 0.150000 | 951.953750  |
+-----+------+------+------+------+----------+-------------+

Well done! Can't remember the last time I saw a problem as cool as this one on SO — Veljko89, Jan 23 '17 at 12:09
@Pரதீப் Impressive. Did you rewrite this to use window functions (ORDER BY UNBOUNDED PRECEDING)? Would it improve the readability? This is actually pretty compact and reads easy. The performance (15 to 17 sec you mentioned above) is from this query? — suresubs, Dec 24 '17 at 16:05
@suresubs - It's impossible to write this query using aggregate window functions. Trust me, I have done enough research — Pரதீப், Dec 24 '17 at 16:08
NB: You're using `b.COL2 / 100.00` instead of `COL4`. If you have the option to change `COL4` to `PERSISTED` you may get another slight boost. — JohnLBevan, Dec 07 '18 at 16:55

Martin Smith · Answer 2 · 2017-01-28T12:51:14.637

This answer may be disappointing but you'll likely find that an iterative CLR approach performs competitively with any TSQL approach.

Try the following (based on Running sums yet again: SQLCLR saves the day!)

using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;

public partial class StoredProcedures
{
    [Microsoft.SqlServer.Server.SqlProcedure]
    public static void StackoverflowQuestion41803909()
    {
        using (SqlConnection conn = new SqlConnection("context connection=true;"))
        {
            SqlCommand comm = new SqlCommand();
            comm.Connection = conn;
            comm.CommandText = @"
SELECT [rno],
       [ccp],
       [col1],
       [col2],
       [col3],
       [col4]
FROM   Table1
ORDER  BY ccp,
          rno 
";

            SqlMetaData[] columns = new SqlMetaData[7];
            columns[0] = new SqlMetaData("rno", SqlDbType.Int);
            columns[1] = new SqlMetaData("ccp", SqlDbType.VarChar, 50);
            columns[2] = new SqlMetaData("col1", SqlDbType.Int);
            columns[3] = new SqlMetaData("col2", SqlDbType.Int);
            columns[4] = new SqlMetaData("col3", SqlDbType.Int);
            columns[5] = new SqlMetaData("col4", SqlDbType.Decimal, 17, 6);
            columns[6] = new SqlMetaData("result", SqlDbType.Decimal, 17, 6);

            SqlDataRecord record = new SqlDataRecord(columns);

            SqlContext.Pipe.SendResultsStart(record);

            conn.Open();

            SqlDataReader reader = comm.ExecuteReader();

            string prevCcp = null;
            decimal offset = 0;

            while (reader.Read())
            {
                string ccp = (string)reader[1];
                int col1 = (int)reader[2];
                int col3 = (int)reader[4];
                decimal col4 = (decimal)reader[5];

                if (prevCcp != ccp)
                {
                    offset = 0;
                }

                offset = ((col1 + offset) * (1 + col4));
                record.SetInt32(0, (int)reader[0]);
                record.SetString(1, ccp);
                record.SetInt32(2, col1);
                record.SetInt32(3, (int)reader[3]);
                record.SetInt32(4, col3);
                record.SetDecimal(5, col4);
                record.SetDecimal(6, col3 - offset);

                SqlContext.Pipe.SendResultsRow(record);

                prevCcp = ccp;
            }

            SqlContext.Pipe.SendResultsEnd();
        }
    }
};
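The core of the procedure is the one-line recurrence `offset = (col1 + offset) * (1 + col4)`, reset whenever ccp changes. Below is a minimal Python sketch of that same single ordered pass, only to show that the recurrence reproduces the expected results from the question without re-reading earlier rows. It is not CLR code, and `ROWS` and `running_results` are illustrative names.

```python
from decimal import Decimal

# Sample rows from the question: (rno, ccp, col1, col2, col3)
ROWS = [
    (1, "ccp1", 15, 10, 1100),
    (2, "ccp1", 20, 10, 1210),
    (3, "ccp1", 30, 10, 1331),
    (4, "ccp2", 10, 15, 900),
    (5, "ccp2", 15, 15, 1000),
    (6, "ccp2", 20, 15, 1010),
]

def running_results(rows):
    """One pass in (ccp, rno) order, carrying the compounded offset forward."""
    out = []
    prev_ccp = None
    offset = Decimal(0)
    for rno, ccp, col1, col2, col3 in sorted(rows, key=lambda r: (r[1], r[0])):
        if ccp != prev_ccp:
            offset = Decimal(0)  # new ccp group: start compounding again
        # Same step as the C# loop: add this row's col1, then compound once.
        offset = (col1 + offset) * (1 + Decimal(col2) / 100)
        out.append((rno, col3 - offset))
        prev_ccp = ccp
    return out

for rno, value in running_results(ROWS):
    print(rno, value)
```

Each row costs one multiplication and one addition, so the work grows linearly with the row count rather than quadratically within each ccp.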

score 5 · Answer 3 · answered Jan 23 '17 at 12:57

Another option

CREATE TABLE #Table1
  (
     rno   int identity(1,1),
     ccp   varchar(50),
     [col1] INT,
     [col2] INT,
     [col3] INT,
     col4 as [col2]/100.0
  );

INSERT INTO #Table1
            (ccp,[col1],[col2],[col3])
VALUES      ('ccp1',15,10,1100),
            ('ccp1',20,10,1210),
            ('ccp1',30,10,1331),
            ('ccp1',40,10,1331),
            ('ccp2',10,15,900),
            ('ccp2',15,15,1000),
            ('ccp2',20,15,1010);

select t.*, col3-s
from(
    select *, rn = row_number() over(partition by ccp order by rno)
    from #Table1
) t
cross apply (
    select s=sum(pwr*col1)
    from(
        select top(rn)
           col1, pwr = power(1+col4, rn + 1 - row_number() over(order by rno))
        from #Table1 t2
        where t2.ccp=t.ccp
        order by row_number() over(order by rno)
        )t3
    )t4
order by rno;

Tested it with 3 million records (`50000 ccp's` and `60 rno` for each ccp). Yours took **155-160** seconds. — Pரதீப், Jan 24 '17 at 12:34

score 4 · Accepted Answer · answered Jan 24 '17 at 00:35

An approach with a self join. Not sure if this would be any more efficient than your version with cross apply.

WITH T AS
  (SELECT *,
          ROW_NUMBER() OVER(PARTITION BY CCP
                            ORDER BY RNO) AS RN
   FROM #TABLE1)
SELECT T1.RNO,
       T1.CCP,
       T1.COL1,
       T1.COL2,
       T1.COL3,
       T1.COL3-SUM(T2.COL1*POWER(1+T1.COL2/100.0,T1.RN-T2.RN+1)) AS RES
FROM T T1
JOIN T T2 ON T1.CCP=T2.CCP
AND T1.RN>=T2.RN
GROUP BY T1.RNO,
         T1.CCP,
         T1.COL1,
         T1.COL2,
         T1.COL3

Sample Demo

Tested it with 3 million records (`50000 ccp's` and `60 rno` for each ccp). My query took **15-17** seconds. Yours took **60-65** seconds. — Pரதீப், Jan 24 '17 at 12:33

score 3 · Answer 5 · answered Jan 31 '17 at 21:49

Try this:

;with 
    val as (
        select 
            *, 
            (1 + col2 / 100.00) val,
            row_number() over(partition by ccp order by rno desc) rn
        from #Table1),
res as (
        select 
            v1.rno, 
            --min(v1.ccp) ccp,
            --min(v1.col1) col1, 
            --min(v1.col2) col2, 
            min(v1.col3) col3, 
            sum(v2.col1 * power(v2.val, 1 + v2.rn - v1.rn)) sum_val
        from val v1
        left join val v2 on v2.ccp = v1.ccp and v2.rno <= v1.rno
        group by v1.rno)
select *, col3 - isnull(sum_val, 0)
from res

Performance depends on the indexes, though, so post your index structure for more specific advice. You will probably get the best performance by splitting the work across several temporary tables.

score 2 · Answer 6 · answered Jan 25 '17 at 21:24

After playing with it for some time I believe the answer to the bounty question of whether or not this can be done with a sum() over (order by) is NO. This code is as close as I could get:

select  *, col3 - sum(col1 * power(1 + col4, row_num)) over (partition by ccp order by col1)
from    (
        select  *, row_number() over (partition by ccp order by rno asc) row_num
        from    #Table1
        ) a
order   by 1,2;

This returns the correct result only for the first row in each ccp group. If row_num is calculated using rno desc instead, only the final row in each ccp is correct.
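To make the failure concrete, here is a small Python sketch that mimics the running sum for the ccp1 sample rows. It is an illustration of the arithmetic only, not T-SQL; `CCP1` and `running_sum_attempt` are made-up names, and the rows are taken in rno order, which for the sample data is the same as ordering by col1. The point is that each row's exponent is fixed once, while the expected result needs the exponent of every earlier row to grow by one at each step.

```python
from decimal import Decimal

# ccp1 rows from the question: (col1, col2, col3), already in rno order
CCP1 = [(15, 10, 1100), (20, 10, 1210), (30, 10, 1331)]

def running_sum_attempt(rows, descending=False):
    """Mimic col3 - SUM(col1 * POWER(1 + col4, row_num)) OVER (ORDER BY ...)."""
    n = len(rows)
    running = Decimal(0)
    results = []
    for i, (col1, col2, col3) in enumerate(rows, start=1):
        # The exponent is decided once per row and never changes afterwards.
        row_num = n - i + 1 if descending else i
        running += col1 * (1 + Decimal(col2) / 100) ** row_num
        results.append(col3 - running)
    return results

ascending = running_sum_attempt(CCP1)
descending = running_sum_attempt(CCP1, descending=True)
print(ascending)   # only the first value matches the expected 1083.5
print(descending)  # only the last value matches the expected 1253.835
```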

It appears that the only ways to get this to work in the simple way that the syntax suggests would be:

Syntax support to reference the actual row within the aggregate over function. This does not exist in T-SQL as far as I can find.
Syntax support for a window function within a window function. This also is not permitted in T-SQL per the following error:

Windowed functions cannot be used in the context of another windowed function or aggregate.

This was an interesting problem. I'd be curious how this solution performs against your large dataset even though the actual result is incorrect.
