Need to make a query more efficient

Question

I have a query which I need to make more efficient.

I am breaking it down into sections to see where the efficiency floors are, I currently have a few Nested Select statements, are these a performance problem?

Here is an example of one of them:

SELECT AgreementID,
       DueDate,
       UpdatedAmountDue AS AmountDue,
       COALESCE((SELECT SUM(UpdatedAmountDue)
                 FROM RepaymentBreakdown AS B
                 WHERE CONVERT(datetime, CONVERT(varchar, DueDate, 103), 103) <= 
                       CONVERT(datetime, CONVERT(varchar, R.DueDate, 103), 103)
                 AND B.AgreementID = R.AgreementID),0) AS DueTD,
       RN=ROW_NUMBER() OVER (Partition BY R.AgreementID ORDER BY DueDate)
FROM RepaymentBreakdown AS R

Is there a more clean and efficient way of getting the data of DueTD?

Basically, for each line of a repayment schedule result, I want to get:

AgreementID,
DueDate,
AmountDue,
AmountDueToDate (DueTD)
RowNumber.

The table I am querying is structured as follows:

AgreementID (int), 
DueDate (datetime),
AmountDue (decimal(9,2)),
UpdatedAmountDue (decimal(9,2))*

*UpdatedAmountDue is always referenced as it is the moving figure, AmountDue is always fixed, as a reference value.

It looks like you can find useful tips here http://stackoverflow.com/questions/860966/calculate-a-running-total-in-sqlserver/13744550#13744550 also, what dates do you have in your `DueDate` column? is (AgreementID , DueDate) unique? — Roman Pekar, Jul 03 '13 at 08:52
Hi Roman, thanks for the Link, I will take a look at that... Re my data, `DueDate` contains dates formatted: > '2012-08-24 00:00:00.000', Agreement ID & Due Date are always unique as an agreement can only have 1 payment per day. — Richard Gale, Jul 03 '13 at 08:57
last question - is your DueDate sequential or it could have gaps? — Roman Pekar, Jul 03 '13 at 08:58
There will be gaps in the dates, varying from 1 week up to 1 month. — Richard Gale, Jul 03 '13 at 08:58

score 1 · Accepted Answer · edited May 23 '17 at 10:25

So, I think you could get performance boost just by removing convert, like this:

select
    AgreementID,
    DueDate,
    UpdatedAmountDue as AmountDue,
    (
        select sum(B.UpdatedAmountDue)
        from RepaymentBreakdown as B
        where B.DueDate <= R.DueDate and B.AgreementID = R.AgreementID
    ) as UpdatedAmountDue 
from RepaymentBreakdown AS R

The fastest way I know to calculate running total in SQL Server 2008 would be to use recursive CTE, see my answer here Calculate a Running Total in SqlServer. In your case the query would be smth like this:

create table #t (....., primary key (AgreementID, ord))

insert into #t (AgreementID, DueDate, UpdatedAmountDue, ord)
select AgreementID, DueDate, UpdatedAmountDue, row_number() over (partition by AgreementID, DueDate order by DueDate asc)

;with 
CTE_RunningTotal
as
(
    select T.ord, T.AgreementID, T.DueDate, T.UpdatedAmountDue as T.AmountDue, T.UpdatedAmountDue
    from #t as T
    where T.ord = 1
    union all
    select T.ord, T.AgreementID, T.DueDate, T.UpdatedAmountDue as T.AmountDue, T.UpdatedAmountDue + C.UpdatedAmountDue as UpdatedAmountDue
    from CTE_RunningTotal as C
        inner join #t as T on T.ord = C.ord + 1 and T.AgreementID = C.AgreementID
)
select AgreementID, DueDate, AmountDue, UpdatedAmountDue
from CTE_RunningTotal as C
option (maxrecursion 0)

score -1 · Answer 2 · answered Jul 02 '13 at 13:45

-1

Your conversion of the datetime to a date has several issues.

First, it is not guaranteed to always produce correct results depending on your servers language settings. If you need to do String manipulation on a datetime value always use CONVERT(,,126).

But more importantly, it prevents index usage. Instead use CAST(DueDate AS DATE) as the optimizer recognizes that conversion to be index-safe.

Afterwards you might want to add an index on AgreementId,DueDate and either INCLUDE UpdatedAmountDue or better make it clustered.

Assuming UpdatedAmountDue cannot be NULL, you can get rid of the COALESCE too, as the sum always includes the current row.

answered Jul 02 '13 at 13:45

Sebastian Meine

11,260
29
41

OK, so in terms of the nested Select, will it cause performance issues? – Richard Gale Jul 02 '13 at 14:12
Most likely not. It is a single value correlated subquery and SQL Server should treat it as such. You could change in by using a `CROSS APPLY` instead but most likely you won't need to. – Sebastian Meine Jul 02 '13 at 15:11

Need to make a query more efficient

2 Answers2