1

I have this simple SQL as a source in a SSIS task:

Select * from budgetview

the source is:

CREATE VIEW [dbo].[BudgetView] AS
SELECT   DISTINCT  Country, 
            SDCO AS Company, 
            SDAN8 AS Customer, 
            SDLITM AS PrintableItemNumber, 
            dbo.fn_DateFromJulian(SDIVD) AS Date, 
            SDPQOR/100.0 AS Quantity, 
            SDAEXP/100.0 AS Value, 
            SDITWT/10000.0 AS Weight
FROM         dbo.F553460

There are NO advices for indexes, every thing seems optimized.

The function fn_DateFromJulian source is:

CREATE FUNCTION [dbo].[fn_DateFromJulian] 
(
    @JulianDate numeric(6,0)
)
RETURNS date
AS
BEGIN
    declare @resultdate date=dateadd(year,@JulianDate/1000,'1900-01-01')
    set @resultdate=dateadd(day,@JulianDate%1000 -1,@resultdate)
    return @resultdate

END

The problem is that i am waiting around 20 minutes just to get the rows going in SSIS.... SSIS task

I am waiting there 20mins BEFORE it gets started

Are there any suggestions to find the culprit?

e4rthdog
  • 5,103
  • 4
  • 40
  • 89
  • 1
    Scalars are evil http://sqlblog.com/blogs/paul_white/archive/2012/09/05/compute-scalars-expressions-and-execution-plan-performance.aspx http://sqlblog.com/blogs/rob_farley/archive/2011/11/08/when-is-a-sql-function-not-a-function.aspx Ideal, in my mind, would be to make that into an inline table valued function and then use it as such. Might save you some processing time. The other thing you could do to speed up the first 10k with a hint but seems questionable here http://sqlblog.com/blogs/rob_farley/archive/2011/02/17/the-ssis-tuning-tip-that-everyone-misses.aspx – billinkc Jan 12 '15 at 16:58
  • Can you help me understand your function? Looking at the [wiki article](http://en.wikipedia.org/wiki/Julian_day), I should be able to pass 2451545 into the function and get 2001-01-1 back but the function only accepts 6 digits – billinkc Jan 12 '15 at 17:05
  • No its like this: 110235 is the 235th day of year 2010. I am using this in our ERP , JDEdwards. See this: http://stackoverflow.com/questions/1171208/what-is-the-precise-definition-of-jdes-julian-date-format – e4rthdog Jan 12 '15 at 17:14
  • What is performance like for the same query without the function? – Martin Smith Jan 12 '15 at 19:04
  • The performance strangely reamins the same even when applying the anser below...I will now try to get the date via a lookup – e4rthdog Jan 13 '15 at 11:04
  • 1
    So if my answer doesn't really solve the problem, then don't mark it as such. Let's find out what's going on. Unicorn points are great and all, but let's solve problems. In management Stupdi, hit Ctrl-M to generate an actual query plan and then run your SELECT statement against the view. Right click on the plan, save as e4rthdog.sqlplan and then post to ... sqlperformance.com or github, whatever so that smarter folks than I can see what's happening in the query. Also, are there other data flows in this package or just this one you've shown? – billinkc Jan 13 '15 at 15:02
  • Dont get harsh mate :) i forgot to tell you that i was missing an index and that after creating it , it actually run faster. But i also saw a 20% increase of speed after i replaced the scalar function with your TVF... – e4rthdog Jan 13 '15 at 19:01

2 Answers2

3

My assumption is that the time spent on the view is consumed by calculating the Julian date value. Without seeing the actual query plan, it seems a fair guess based on the articles below.

Rewrite the original function as a table valued function below (I've simply mashed your code together, there are likely opportunities for improvement)

CREATE FUNCTION dbo.fn_DateFromJulianTVF
(
    @JulianDate numeric(6,0)
)
RETURNS TABLE AS
RETURN
(
    SELECT dateadd(day,@JulianDate%1000 -1,dateadd(year,@JulianDate/1000,CAST('1900-01-01' AS date))) AS JDEDate
)

Usage would be

CREATE VIEW [dbo].[BudgetView] AS
SELECT   DISTINCT  Country, 
            SDCO AS Company, 
            SDAN8 AS Customer, 
            SDLITM AS PrintableItemNumber, 
            J.JDEDate AS [Date], 
            SDPQOR/100.0 AS Quantity, 
            SDAEXP/100.0 AS Value, 
            SDITWT/10000.0 AS Weight
FROM         dbo.F553460 AS T
    CROSS APPLY
        dbo.fn_DateFromJulianTVF(T.SDIVD) AS J

Scalar valued function, smell like code reuse, performs like a reused disposable diaper

billinkc
  • 59,250
  • 9
  • 102
  • 159
1

Just checking, but am I right to understand that for every unique value of T.SDIVD there will be just one unique result value of the function ? In other words, no two different T.SDIVD will return the same value from the function?

In that case what is happening here (IMHO) is that you first do scan over the entire table, for each and every record calculate the f(SDIVD) value and then send that entire resultset through an aggregation (DISTINCT).

Since functions are far from optimal in MSSQL I'd suggest to limit their use by turning around the chain of events and doing it like this:

CREATE VIEW [dbo].[BudgetView] AS
SELECT /* DISTINCT */
                Country, 
                Company, 
                Customer, 
                PrintableItemNumber, 
                dbo.fn_DateFromJulian(SDIVD) AS Date, 
                Quantity, 
                Value, 
                Weight
          FROM (

                SELECT DISTINCT Country, 
                                SDCO AS Company, 
                                SDAN8 AS Customer, 
                                SDLITM AS PrintableItemNumber, 
                                SDIVD, 
                                SDPQOR/100.0 AS Quantity, 
                                SDAEXP/100.0 AS Value, 
                                SDITWT/10000.0 AS Weight
                           FROM dbo.F553460 ) dist_F553460
               ) 

If you had lots of double records this should improve performance, if you only had few of them it won't make much of a difference, if any. If you know you have no doubles at all you should get rid of the DISTINCT in the first place as that is what causing the delay!

Anyway, regarding the function you can add the following trick:

CREATE FUNCTION [dbo].[fn_DateFromJulian] 
(
    @JulianDate numeric(6,0)
)
RETURNS date
WITH SCHEMABINDING
AS
BEGIN
    declare @resultdate date=dateadd(year,@JulianDate/1000,'1900-01-01')
    set @resultdate=dateadd(day,@JulianDate%1000 -1,@resultdate)
    return @resultdate

END

The WITH SCHEMABINDING causes some internal optimisations that will make its execution slightly faster, YMMV. There are limitations to it, but here it will work nicely.

Edit: removed the 'outer' DISTINCT since it's (likely, cf my first assumption) not needed.

deroby
  • 5,902
  • 2
  • 19
  • 33