2

The following procedure takes forever. Its purpose is to concatenate a list of IDs into one string.

declare @validatedIDList nvarchar(max)

SELECT
    @validatedIDList = COALESCE(@validatedIDList + ', ', '') + CAST(si.SelectedID AS nvarchar(50))
FROM 
    dbo.SelectedID si

dbo.SelectedID is a table with only one int field. It contains about 60000 rows. I tried to change COALESCE to ISNULL but with no significant improvement. Any idea of how to make this request faster?

user3240131
  • 187
  • 1
  • 2
  • 11
  • 1
    Does that query even give you the _results_ you want? – D Stanley Feb 14 '14 at 14:34
  • Yes but it is taking much longer than before : about 10 minutes. It used to take 10 seconds with 40000 inputs on a different database. – user3240131 Feb 14 '14 at 14:39
  • 1
    Repeated string concatenation is often slow. `.NET` has a `StringBuilder` for example to avoid this. [This method is not guaranteed to work anyway](http://stackoverflow.com/questions/15138593/nvarchar-concatenation-index-nvarcharmax-inexplicable-behavior/15163136#15163136) – Martin Smith Feb 14 '14 at 15:01
  • 1
    For some reason, this warning is under [`SET @local_variable`](http://msdn.microsoft.com/en-us/library/ms189484.aspx) rather than `SELECT @local_variable`: "Do not use a variable in a SELECT statement to concatenate values (that is, to compute aggregate values). Unexpected query results may occur. This is because all expressions in the SELECT list (including assignments) are not guaranteed to be executed exactly once for each output row" – Damien_The_Unbeliever Feb 14 '14 at 15:24
  • @Damien_The_Unbeliever - Unfortunately they still don't abide by their own advice. e.g. See the script after "To update statistics for all memory-optimized tables in the current database, run the following script" [here](http://msdn.microsoft.com/en-us/library/dn232522(v=sql.120).aspx) – Martin Smith Feb 14 '14 at 15:27

3 Answers3

3

Running this on a table containing consecutive integers from 1 - 60,000 takes just over 5 minutes on my machine.

DECLARE @validatedIDList NVARCHAR(max)

SELECT @validatedIDList = COALESCE(@validatedIDList + ', ', '') + 
                          CAST(si.SelectedID AS NVARCHAR(50))
FROM   dbo.SelectedID si

SELECT internal_objects_alloc_page_count,
       internal_objects_dealloc_page_count,
       internal_objects_alloc_page_count - internal_objects_dealloc_page_count
FROM   sys.dm_db_task_space_usage
WHERE  session_id = @@SPID 


SELECT DATALENGTH(@validatedIDList)   

Profiling the process shows that it spends a lot of time performing Blob manipulation

enter image description here

To answer the question why this is so slow nvarchar(max) variables are stored as LOB data in tempdb on 8KB pages.

The final length of the string is 817,784 bytes (approx one hundred 8KB pages). The result of the query above is

+-----------------------------------+-------------------------------------+-------+
| internal_objects_alloc_page_count | internal_objects_dealloc_page_count |       |
+-----------------------------------+-------------------------------------+-------+
|                           5571528 |                             5571424 |   104 |
+-----------------------------------+-------------------------------------+-------+

Showing the final string actually consumes 104 pages in tempdb.

Dividing the alloc_page_count by 60,000 shows that the average number of pages allocated and deallocated per assignment is 93.

Each concatenation operation does not just append to the existing LOB data but instead makes a copy of it. As the string grows longer the amount of work per concatenation grows accordingly.

The XML PATH method is much more efficiently implemented and has the additional "bonus" that it is actually documented to work (unlike the method in the question)

The article Concatenating Row Values in Transact-SQL has a good round up of the available methods.

Community
  • 1
  • 1
Martin Smith
  • 438,706
  • 87
  • 741
  • 845
  • 1
    +1, Using `XML PATH` to concatenate values is documented by Microsoft [here](http://technet.microsoft.com/en-us/library/bb510419.aspx). – Mikael Eriksson Feb 14 '14 at 16:30
1

This can be done with the STUFF and FOR XML PATH commands

DECLARE @validatedIDList nvarchar(max)

SELECT @validatedIDList = STUFF((SELECT ', ' + COALESCE(CAST(si.SelectedID AS (nvarchar(50)), '')
FROM dbo.SelectedID si
FOR XML PATH('')), 1, 1, '')
TTeeple
  • 2,913
  • 1
  • 13
  • 22
1
DECLARE @SelectedID TABLE (IDs INT)
INSERT INTO @SelectedID VALUES
(1),(2),(null),(4),(5),(null),(7)


declare @validatedIDList nvarchar(max)


SET @validatedIDList =  STUFF((SELECT ', ' + CAST(si.IDs AS nvarchar(50))
                              FROM @SelectedID si
                              FOR XML PATH('')),1 ,2 ,'')

SELECT @validatedIDList


RESULT : 1, 2, 4, 5, 7
M.Ali
  • 67,945
  • 13
  • 101
  • 127