2

The following insert query is taking 7 seconds to run, and since it runs multiple times in a loop, the time is adding up. I need help to speed this up, 90% of the cost goes to a clustered index insert and 10% to a table scan of FN_qryPSLA().

INSERT INTO tblTPS (fldPK, fldDTA, fldCI, fldMN, fldMDN, fldIQ, fldSD, fldNDS, fldNIN, fldNL, fldMin, fldMax, fldUNC, fldAVA, fldBA)
SELECT fldPK, fldDTA, fldCI, fldMN, fldMDN, fldIQ, fldSD, fldNDS, fldNIN, fldNL, fldMin, fldMax, fldUNC, fldAVA, 1 AS fldBA
FROM FN_qryPSLA()
WHERE (((fldPK)= 37923 ) AND ((fldMN)<>-254));

The following is FN_qryPSLA()... Cost is 85% on Distinct Sort (if removed the cost is 80% Hash Match (Aggregate))

FUNCTION [dbo].[FN_qryPSLA](@PK INT)
RETURNS @tmpTblPSLA TABLE (
fldPK       BIGINT      NOT NULL,           
fldDTA          DATETIME    NULL,
fldCI       FLOAT       NULL,   
fldMN       FLOAT       NULL,
fldMDN      FLOAT       NULL,
fldIQ       FLOAT       NULL,               
fldSD       FLOAT       NULL,
fldNDS      BIGINT      NULL,
fldNIN      BIGINT      NULL,
fldNL       BIGINT      NULL,
fldMin      FLOAT       NULL,
fldMax      FLOAT       NULL,
fldUNC      VARCHAR(5)  NULL, 
fldAVA      TINYINT     NULL
) 
AS
BEGIN 
WITH gDPS AS 
(SELECT fldPK, Max(fldDTA) AS fldDTA
FROM tblAPS
GROUP BY fldPK)

    SELECT fldPK, fldDTA, fldCI, fldMN, fldMDN, fldIQ, fldSD, fldNDS, fldNIN, fldNL, fldMin, fldMax, fldUNC, fldAVA
FROM tblAPS INNER JOIN getDPS ON (tblAPS.fldDTA = gDPS.fldDTA) AND (tblAPS.fldPK = gDPS.fldPK)
    GROUP BY fldPK, fldDTA, fldCI, fldMN, fldMDN, fldIQ, fldSD, fldNDS, fldNIN, fldNL, fldMin, fldMax, fldUNC, fldAVA
    HAVING tblAPS.fldMN<>-254
    ORDER BY tblAPS.fldPK, tblAPS.fldCI;
    RETURN; 
END;

Fields involved (2 million records in the table being queried) Clustered Index on fldPSD...

fldPK           BIGINT      NOT NULL,           
fldDTA  DATETIME    NULL,
fldCI       FLOAT       NULL,   
fldMN           FLOAT       NULL,
fldMDN          FLOAT       NULL,
fldIQ               FLOAT       NULL,               
fldSD               FLOAT       NULL,
fldNDS      BIGINT      NULL,
fldNIN          BIGINT      NULL,
fldNL           BIGINT      NULL,
fldMin              FLOAT       NULL,
fldMax              FLOAT       NULL,
fldUNC      VARCHAR(5)  NULL, 
fldAVA      TINYINT     NULL
codingManiac
  • 1,654
  • 4
  • 29
  • 42
  • Show us the *whole* definition of `FN_qryPSLA()`. You have withheld the definition's header, which happens to be very important for this kind of performance problem. – RBarryYoung Oct 09 '13 at 23:14
  • @RBarryYoung I edited the post to include the rest of the function. – codingManiac Oct 10 '13 at 11:18
  • I eventually got to SQL suggesting this index...CREATE NONCLUSTERED INDEX [$fldPKMdnInd] ON [dbo].[tblAPS] ([fldPK],[fldMDN]) INCLUDE ([fldDTA],[fldCI],[fldMDN],[fldIQ],[fldSD],[fldNDS],[fldNIN],[fldNL],[fldMin],[fldMax],[fldUNC],[fldAVA]) GO – codingManiac Oct 10 '13 at 11:27
  • Here's what I would look at: "*The following insert query is taking 7 seconds to run, and since it runs multiple times in a loop, the time is adding up.*" Loops are not a desirable thing in SQL since they are the deconstructed pieces of a query/set that usually could be faster and simpler. – RBarryYoung Oct 10 '13 at 12:03

1 Answers1

6

This snippet stood out to me:

90% of the cost goes to a clustered index insert

With that in mind, I can think of two things to try:

  1. Changing the Fill Factor for the target table to allow some empty space in each page for the incoming data.
    Or:
  2. Ordering the data before insert, so that it matches the primary key order.

The goal of both of these is to reduce the need for re-paging the data and indexes during the insert.

In the first case, you are reserving empty space in each page at the creation time, such that as you insert records you don't need to split or create new pages as often.

In the second case, you can't use an ORDER BY with a SELECT command that is part of an insert... but you can influence the order through the indexes on the tables in that SELECT command. The goal is that by feeding the data in order, you'll be working in one page at a time in sequence while inserting data, rather than jumping around and possibly needing to rebuild the same page over and over.

Joel Coehoorn
  • 399,467
  • 113
  • 570
  • 794
  • The target table actually only holds temporary data and not a whole lot, so I decided to remove the indexes. Now the cost is 75% insert into tblTPS, is there anyway to cut this cost down? – codingManiac Oct 09 '13 at 21:40
  • No, and you wouldn't want to. Remember, you're seeing percentages here: the number always has to add up to 100. What you're seeing there is the raw insert performance: actually inserting the data. 75% there isn't bad. What you can do now is get more memory or faster disks, or reduce the other 25%. – Joel Coehoorn Oct 09 '13 at 21:46
  • The other 25% goes towards scanning the subquery, and the subquery has 85% towards distinct ordering, but at that point I'm really probably only going to squeeze out ms. – codingManiac Oct 09 '13 at 21:48
  • I just put the fldPK condition into the subquery and cut it down to 2 seconds at least – codingManiac Oct 09 '13 at 21:51
  • @Joel, in your last paragraph, you say, "you can't use an ORDER BY with a SELECT command that is part of an insert" ... do you mean an INSERT INTO ... SELECT * FROM ... ? How do you order the the data without the ORDER BY clause? – JustBeingHelpful Jan 25 '14 at 08:48
  • This says you cannot. http://stackoverflow.com/questions/10064532/the-order-of-a-sql-select-statement-without-order-by-clause – JustBeingHelpful Jan 25 '14 at 08:51
  • This says you cannot as well. http://blogs.msdn.com/b/conor_cunningham_msft/archive/2008/08/27/no-seatbelt-expecting-order-without-order-by.aspx?Redirected=true – JustBeingHelpful Jan 25 '14 at 09:03
  • 1
    @MacGyver You can't guarantee the order, but you can strongly influence it through use of indexes – Joel Coehoorn Jan 27 '14 at 01:43