Improve performance of SQL query with a FOR XML PATH in SQL Server 2008

Question

I am trying to concatenate LoadIds for a user using For XML path which is a portion of my whole query, I verified that the maximum time gets elapsed in calculating this concatenated LoadId column. Below is the syntax, can anyone suggest a way to rewrite this efficiently?

SELECT 
    Col1, col2, 
    LoadIds = STUFF((SELECT ' , ' + CAST([LoadId] AS varchar(5))
                     FROM Table1 AS t1
                     WHERE t1.[UserId] = [t2].[UserId]
                     FOR XML PATH ('')), 1, 2, '')
FROM 
    Table1 AS t2
GROUP BY 
    [UserId]

What exactly about the performance is a problem for you? This looks like a very typical `FOR XML PATH` query in SQL Server. Do you have an index on `UserId`? — Tim Biegeleisen, Apr 17 '18 at 01:30
@TimBiegeleisen If I comment calculation of LoadIds, performance of the query is within 20 seconds and with LoadIds it goes upto a minute. I don't have index on UserId and I don't have permission to create — Naina, Apr 17 '18 at 01:46
@Naina... Is there any typo in posted query ? cause i see only `userid` in `group by` clause but the `select` statement has nos of `expression`. — Yogesh Sharma, Apr 17 '18 at 04:16
If you don't have the permission to create an index, this will not help you probably, but there is the chance to use a [CLR function](https://stackoverflow.com/q/48783205/5089204). — Shnugo, Apr 17 '18 at 07:20
@Naina, this one is also a good one for SQL CLR [GROUP_CONCAT](https://github.com/orlando-colamatteo/ms-sql-server-group-concat-sqlclr) — Biju jose, Apr 17 '18 at 07:39

Vadim Loboda · Answer 1 · 2018-04-17T07:41:41.743

2

1). I would try to collect the required data in a temp table or table variable, create an index on it and then play with the concatination.

2). FOR XML PATH works good for small sets of records, for large sets I would try a recursion.

declare @T table (
    UserId     int        not null,
    RowNumber  int        not null,
    LoadId     varchar(5) not null

    primary key clustered (UserId, RowNumber)
);

insert into @T
select
    UserId,
    row_number() over(partition by UserId order by LoadId),
    CAST(LoadId AS varchar(5))    
from
    Table1 ;


with cte (UserId, RowNumber, LoadIds) as 
(
    select 
        UserId, 
        RowNumber,
        LoadIds = convert(varchar(8000), LoadId)
    from @T
    where RowNumber = 1

    union all

    select 
        t.UserId, 
        t.RowNumber,
        convert(varchar(8000), cte.LoadIds + ', ' + t.LoadId)
    from 
        cte inner join @T t on t.UserId = cte.UserId and t.RowNumber = cte.RowNumber + 1
)
select UserId, LoadIds = max(LoadIds) from cte group by UserId;

edited Apr 17 '18 at 07:41

answered Apr 17 '18 at 07:32

Vadim Loboda

2,431
27
44

3

may be you should look this,https://sqlperformance.com/2014/08/t-sql-queries/sql-server-grouped-concatenation – Biju jose Apr 17 '18 at 07:41
But I've heard the opposite opinion and seen other results as well. That is why I wrote "I would try..." – Vadim Loboda Apr 17 '18 at 07:57
I usually keep some cheat sheets with me , like commonly used techniques to solve these type of problem , for grouped concatenation the best way to go is xml (some caveats are there which you should keep in mind) and CLR. A great read by late Dwain camps is here [High perf T SQL code patterns](https://dwaincsql.com/2015/05/27/high-performance-t-sql-using-code-patterns/) – Biju jose Apr 17 '18 at 07:58
even I tried r-cte some times until I came across with this. That's why pointed it out, now onwards you will keep an eye on r-cte :) – Biju jose Apr 17 '18 at 08:00
1

For others looking for a solution, I found that creating an index on the temp table as suggested by @VadimLoboda really sped up my query – RoyalSwish Jun 28 '21 at 13:21

Gordon Linoff · Answer 2 · 2018-04-17T02:44:43.000

0

One thing you can do is select the distinct user ids before doing the string aggregation:

SELECT t2.UserId, 
        STUFF((SELECT ' , ' + CAST([LoadId] AS varchar(8000))
               FROM Table1 t1
               WHERE t1.[UserId] = [t2].[UserId]
               FOR xml PATH ('')
              ), 1, 2, ''
             ) as LoadIds
FROM (SELECT DISTINCT userId
      FROM Table1 t2
     ) t2;

For performance, an index on table1(UserId, LoadId) would also help.

edited Apr 17 '18 at 02:44

answered Apr 17 '18 at 01:45

Gordon Linoff

1,242,037
58
646
786

Thanks for you suggestion but that didn't help – Naina Apr 17 '18 at 02:24

Improve performance of SQL query with a FOR XML PATH in SQL Server 2008

2 Answers2