row_number and count very slow in query with dynamic sql and CTE, how to implement skip and take

Question

I have the following scenario: a table with up to 20000 entries and another table with corresponding custom fields. I need to implement a query with a filter opportunity over all colums (including the custom fields) AND skip and take AND I need the total row count after filtering. With the help of dynamic sql I managed to implement a query which adds the custom fields as columns to the first table. But I am really having troubles implementing a skip and take functionality which runs really fast AND also returns the total row count. As not all of our customers are running on SQL Server 2012, the best way would be to implement the take and skip via a where clause over the row_number, but i think this also is the slowest alternative. I like the OFFSET and FETCH functionality for the 2012 case, but still getting the row count seems to slow down a lot.

this is my query resulting from the dynamic sql, below the whole dynamic query, both with the two alternatives as comments

With tempTable AS 
(
SELECT *
 --1. ALTERNATIVE:
,ROW_NUMBER() Over (ORDER BY  Nachname, Vorname desc) As  ROW_NUMBER ,COUNT(1) OVER () as Total_Rows
FROM
(
    SELECT  [a].*, [DFSF].[Datenfeld_Name],[Datenfeld_Inhalt]
    FROM 
    [dbo].[Ansprechpartner] AS a left join [dbo].[Datenfeld] AS dfe
    ON [a].[Id] = [dfe].[Datenfeld_AnsprechpartnerID]
    left join
    [Datenfeld_Standardfelder] AS DFSF 
    ON dfe.[StandardfeldID] = [DFSF].[id] and datenfeld_kategorie = 'Ansprechpartner'   

) AS j
PIVOT
(
  max([Datenfeld_Inhalt]) FOR [j].[Datenfeld_Name]  IN ([Medium],[Kontaktthema],[Mediengattung],[Medienthema],[E-Mail],[Homepage],[Rolle])
) AS p
)
SELECT *
--2. ALTERNATIVE:
--,COUNT(1) OVER ()
FROM tempTable
 WHERE 1=1
 -- 1. ALTERNATIVE:
and Row_Number BETWEEN 0 AND 100
ORDER BY  Nachname, Vorname DESC
--2. ALTERNATIVE:
OFFSET 0 ROWS FETCH NEXT 100 ROWS ONLY
 ;

and following the whole dynamic query. I actually think, there is an error in it, because like this I won't get the correct row count, I would probably have to call it once again without the row_number filter to get it right...

DECLARE @filterExpression nvarchar(MAX)
DECLARE @showOnlyDoublets int
DECLARE @sortExpression nvarchar(MAX)
 DECLARE @skip AS int
 DECLARE @take AS [int]
 SELECT @skip = 0
 SELECT @take = 100
--SELECT @filterExpression = 'WHERE Vorname like ''%luc%'''
SELECT @filterExpression = ' WHERE 1=1'
SELECT @sortExpression = 'ORDER BY  Nachname, Vorname desc'
SELECT @showOnlyDoublets = 0

DECLARE @idList nvarchar(MAX)
select @idList =  COALESCE(@idList + '],[', '[') + [DFSF].[Datenfeld_Name] from
[Datenfeld_Standardfelder] AS DFSF 
where datenfeld_kategorie = 'Ansprechpartner'

SELECT @idList = @idList +']'
--SELECT @idList

DECLARE @sqlToRun nvarchar(max)
SET @sqlToRun = 
'With tempTable As 
(
SELECT *
, ROW_NUMBER() Over (' + @sortExpression + ') As  Row_Number 
FROM
(
    SELECT  [a].*, [DFSF].[Datenfeld_Name],[Datenfeld_Inhalt]--, CAST( ROW_NUMBER()  OVER(ORDER BY [DFSF].[Datenfeld_Name] DESC) AS varchar(20)) 
    FROM 
    [dbo].[Ansprechpartner] AS a left join [dbo].[Datenfeld] AS dfe
    ON [a].[Id] = [dfe].[Datenfeld_AnsprechpartnerID]
    left join
    [Datenfeld_Standardfelder] AS DFSF 
    ON dfe.[StandardfeldID] = [DFSF].[id] and datenfeld_kategorie = ''Ansprechpartner'' 

) AS j
PIVOT
(
  max([Datenfeld_Inhalt]) FOR [j].[Datenfeld_Name]  IN (' + @idList + ')
) AS p
)
SELECT *,  COUNT(*) OVER () as Total_Rows FROM tempTable
' + @filterExpression +  '
AND Row_Number BETWEEN ' + CAST ( @skip AS varchar ) + ' AND ' + CAST ( @take AS varchar ) + '
' + @sortExpression + '
--OFFSET ' + CAST ( @skip AS varchar ) + ' ROWS FETCH NEXT ' + CAST ( @take AS varchar ) + ' ROWS ONLY
 ;'


PRINT @sqlToRun
EXECUTE sp_executesql @sqlToRun

So my question is: is there a way to improve this query (one of the two alternatives)? Or do you have a totally different idea, because I think, either way, if I call the count correctly, it will cost a lot of time.

score 1 · Answer 1 · answered Jan 29 '15 at 18:21

Friede,

Take at look at this stored proc below. The comments I have added can hopefully help.

It offers a total row count for the entire query regardless of how many records are being returned per page or what page the user is on. What is probably disturbing your number of rows is that the row numbers are part of the where clause in your code. See in this example how to return subsets without having to handle that logic in the where clause.

Your performance issues are probably related to indexing and pivoting. I see you are filtering a column by arbitrary text input. You'll need that column indexed with all the other returned columns included. An alternative that might speed everything up is described in the query below in which you first find all the ID's that match your predicate (with a non-clustered index on the searched column and including the primary key ID column) then you join the actual query to that table. This limits the entire operation, including the pivot, to only the rows that match.

CREATE PROC [api].[cc_pcp_member_search] 
(
     @string VARCHAR(255)    
    ,@plan_long_id BIGINT 
    ,@num_rows BIGINT
    ,@page_num BIGINT
    ,@order_by VARCHAR(255) = 'last_name'
    ,@sort_order VARCHAR(10) = 'ASC'
)

AS 
SET NOCOUNT ON

DECLARE @MemberList AS TABLE (MemberId INT)
INSERT INTO @MemberList

/* What is happening in the EXEC statement below, is that I am running a text search against
very specific indexes that only return the ID's(clustered) of the items I care about.
This has proven to be far more performant than applying the text predicate to the entire
set due to indexing reasons. The 'real' query further below that brings back all the needed fields
uses this list of ID's to define the scope of the set. You might consider something similar if 
you are querying aginst arbitrary text values.
*/
EXEC dbo.uspGetMemberIdForMemberSearch @string 

/* The table variable does slow things down and you might not need it at all.
In this case, the developers were using Entity Framework and it was having 
trouble determining the signature and data types of the stored procedure without
a typed result set
*/
DECLARE @ResultSet AS TABLE (   member_id BIGINT,
                first_name VARCHAR(255),
                last_name VARCHAR(255),
                full_name VARCHAR(255),
                hicn VARCHAR(255),
                gender VARCHAR(255),
                plan_id BIGINT,
                plan_name VARCHAR(255),
                phone_1 VARCHAR(255),
                phone_2 VARCHAR(255),
                dob DATETIME,
                total_records INT)

INSERT INTO @ResultSet
SELECT
CAST(M.MemberId AS BIGINT) AS member_id,
M.FirstName AS first_name,
M.LastName AS last_name,
CONCAT(M.FirstName, ' ', M.LastName) AS full_name,
M.HICN AS hicn,
CASE WHEN M.Gender = 0 THEN 'F' ELSE 'M' END AS gender,
CAST(P.VirtusPlanId AS BIGINT) AS plan_id,
P.PlanName AS plan_name,
M.PhoneNumber1 AS phone_1,
M.PhoneNumber2 AS phone_2,
M.DateOfBirth AS dob,
CAST(0 AS INT) AS total_records --<<< This is the place holder for total count, see further below
FROM
  Member M
  INNER JOIN [Plan] P
    ON M.PlanId = P.PlanId
  INNER JOIN @MemberList ML --<<< This is where the final filtering on ID's happens
    ON M.MemberId = ML.MemberId

/* Heres the core of what you are probably dealing with. The proc allows the caller
to specify sort order, query string, records per page, and current page in the set.
The users will usually search by name, sort by last name, and the grid then pages
through those results 10 rows (defaulted in app) at a time.
*/
DECLARE @InputPageNumber int = @page_num
DECLARE @RowsPerPage INT = @num_rows
DECLARE @RealPageNumber INT = @InputPageNumber + 1 --<<< 0 based index adjustment
DECLARE @OrderBy VARCHAR(255) = @order_by
DECLARE @SortOrder VARCHAR(4) = @sort_order


  SELECT
     member_id
    ,first_name
    ,last_name
    ,full_name
    ,hicn
    ,gender
    ,plan_id
    ,plan_name
    ,phone_1
    ,phone_2
    ,dob
    /* Here is your total row count of the set regardless of how many
    rows are being paged at the moment. Because OVER() is counting the
    filtered set in the table variable only, it is fast. It would do the same
    with a CTE. You're already using the same thing.
    */
    ,CAST(COUNT(*) OVER() AS INT) AS total_records
  FROM @ResultSet
  WHERE plan_id = @plan_long_id

    /* In this usecase, the output of this query feeds a paged web grid. Below is the 
    logic used to determine the sort column at execution time. I'm not pretending it's 
    pretty but it works.
    */
    ORDER BY CASE 
        WHEN @OrderBy = 'first_name' AND @SortOrder = 'ASC'
          THEN first_name
        WHEN @OrderBy = 'last_name' AND @SortOrder = 'ASC'
          THEN last_name
        WHEN @OrderBy = 'full_name' AND @SortOrder = 'ASC'
          THEN full_name
        WHEN @OrderBy = 'hicn' AND @SortOrder = 'ASC'
          THEN hicn
        WHEN @OrderBy = 'plan_name' AND @SortOrder = 'ASC'
          THEN plan_name
        WHEN @OrderBy = 'gender' AND @SortOrder = 'ASC'
          THEN gender
        END ASC,
        CASE 
        WHEN @OrderBy = 'first_name' AND @SortOrder = 'DESC'
          THEN first_name
        WHEN @OrderBy = 'last_name' AND @SortOrder = 'DESC'
          THEN last_name
        WHEN @OrderBy = 'full_name' AND @SortOrder = 'DESC'
          THEN full_name
        WHEN @OrderBy = 'hicn' AND @SortOrder = 'DESC'
          THEN hicn
        WHEN @OrderBy = 'plan_name' AND @SortOrder = 'DESC'
          THEN plan_name
        WHEN @OrderBy = 'gender' AND @SortOrder = 'DESC'
          THEN gender
        END DESC
  OFFSET (@RealPageNumber - 1) * @RowsPerPage ROWS  -- dynamic offset arithmetic to convert requested page number to row offset
  FETCH NEXT @RowsPerPage ROWS ONLY;                -- dynamic number of rows to display per page

row_number and count very slow in query with dynamic sql and CTE, how to implement skip and take

1 Answers1