490

Table is:

Id Name
1 aaa
1 bbb
1 ccc
1 ddd
1 eee

Required output:

Id abc
1 aaa,bbb,ccc,ddd,eee

Query:

SELECT ID, 
    abc = STUFF(
                 (SELECT ',' + name FROM temp1 FOR XML PATH ('')), 1, 1, ''
               ) 
FROM temp1 GROUP BY id

This query is working properly. But I just need the explanation how it works or is there any other or short way to do this.

I am getting very confused to understand this.

Himanshu
  • 31,810
  • 31
  • 111
  • 133
Puneet Chawla
  • 5,729
  • 4
  • 16
  • 33

8 Answers8

900

Here is how it works:

1. Get XML element string with FOR XML

Adding FOR XML PATH to the end of a query allows you to output the results of the query as XML elements, with the element name contained in the PATH argument. For example, if we were to run the following statement:

SELECT ',' + name 
              FROM temp1
              FOR XML PATH ('')

By passing in a blank string (FOR XML PATH('')), we get the following instead:

,aaa,bbb,ccc,ddd,eee

2. Remove leading comma with STUFF

The STUFF statement literally "stuffs” one string into another, replacing characters within the first string. We, however, are using it simply to remove the first character of the resultant list of values.

SELECT abc = STUFF((
            SELECT ',' + NAME
            FROM temp1
            FOR XML PATH('')
            ), 1, 1, '')
FROM temp1

The parameters of STUFF are:

  • The string to be “stuffed” (in our case the full list of name with a leading comma)
  • The location to start deleting and inserting characters (1, we’re stuffing into a blank string)
  • The number of characters to delete (1, being the leading comma)

So we end up with:

aaa,bbb,ccc,ddd,eee

3. Join on id to get full list

Next we just join this on the list of id in the temp table, to get a list of IDs with name:

SELECT ID,  abc = STUFF(
             (SELECT ',' + name 
              FROM temp1 t1
              WHERE t1.id = t2.id
              FOR XML PATH (''))
             , 1, 1, '') from temp1 t2
group by id;

And we have our result:

Id Name
1 aaa,bbb,ccc,ddd,eee
Nimantha
  • 6,405
  • 6
  • 28
  • 69
FutbolFan
  • 13,235
  • 3
  • 23
  • 35
  • 120
    You should work for Microsoft's documentation team (if any) – Fandango68 Jun 13 '17 at 23:49
  • 102
    @Fandango68 ,@FutbolFan - He can't work for Microsoft's documentation team. His explanations are too clear and too direct. ;-) – Chris Nov 18 '17 at 18:03
  • 6
    Good answer. I am surprised that Microsoft didn't make a more convenient way to combine strings until the String_Agg function added in 2017 (https://stackoverflow.com/a/42967358/2012446). I find stuff / xml path a very verbose/cumbersome way to do this. – ChrisProsser Oct 17 '18 at 10:15
  • 3
    @ChrisProsser I agree. Oracle has been ahead of Microsoft on this by introducing `LISTAGG` function in Oracle 11gR2. I do miss that functionality on days where I have to use this instead. https://www.techonthenet.com/oracle/functions/listagg.php – FutbolFan Oct 17 '18 at 22:40
  • 6
    Hello. In step 1, if you do: SELECT name FROM temp1 FOR XML PATH ('') ...you get aaabbb ...etc... I didn't realize this at first... Changing it to SELECT ''+name ...etc... removes the tags. – KevinVictor Oct 19 '18 at 16:30
  • 2
    @ChrisProsser - Sybase ASA has had a `list` function for decades. Unfortunately Microsoft based SQLServer on Sybase's ASE instead, and never bothered with a list function until last year. I agree - it's mind-boggling. And then they do, they call it `string_agg`. I'd have thought `list` was pretty obvious. – youcantryreachingme Nov 09 '18 at 02:37
  • 1
    @ChrisProsser but then, in Sybase ASA you can reference column aliases in `where` clauses without the convoluted syntax of CTEs in spite of MS insisting it is not possible due to order of evaluation operations. But I digress. – youcantryreachingme Nov 09 '18 at 02:39
  • 2
    I just want to say, I've referenced this answer probably like 15 times now, and finally just copied the example into my snips. Thanks @FutbolFan – JSONaLeo Feb 19 '19 at 18:56
  • 2
    Just wondering where it was obvious that adding generated calculated fields (rather than normal fields) creates no xml wrapper around the value... – Neil Walker Feb 20 '19 at 14:46
  • 1
    @KevinVictor, the For XML Path will actually put the name of the column before the value. Adding a ''+Name or doing a Convert(varchar(x),name) cause the column to have no name (you can test it in SSMS easilly by running just the select) and this is why you don't have it in the xml string. in this specific case, you also want the "," to be inserted between the elements so that 1 stone 2 birds. – Dominique Boucher Jun 23 '20 at 18:28
  • 1
    Thanks for the wonderful explanation for the STUFF function, I couldn't find clear and easy documentation like this anywhere else. – neeraja Sep 07 '20 at 09:28
  • what is the t2 after temp1 in step3? – user1169587 Aug 12 '21 at 03:56
  • @user1169587, it's a table alias. It's what the `t2` in `t1.id = t2.id` refers to. – Martha Jun 30 '22 at 19:04
  • @DominiqueBoucher "Convert(varchar(x),name) cause the column to have no name" isn't name possibly already a varchar type? Why use Convert() again? – Shad Jul 15 '22 at 13:20
  • ugh. i hate these implicit joins or whatever they are called, they always sneak up on me. Why can't SQL require that joins are always explicit? I have such a hard time wrapping my head around these invisible joins or even recognizing where they are and what they are doing. – Michael Aug 01 '22 at 18:15
  • Can you explain where the implicit join is in the query? I read that for an implicit join there must be "several tables in the from clause, separated with commas" but I don't see that here. – Michael Aug 01 '22 at 18:16
  • @kid : I'm not suggesting to do another convert, I'm saying that when you do "select convert or if you concatenate 2 columns with the +, the name of the column is "lost" in the rendering. In SSMS, if you do : Select name from sys.databases, it will show one column named "name". If you do select name +'' from sys,databases, it will show one column without name. Same for a convert. – Dominique Boucher Aug 23 '22 at 12:48
  • This was so helpful! I have been looking for an explanation of the parameters! – missscripty Sep 29 '22 at 17:56
109

This article covers various ways of concatenating strings in SQL, including an improved version of your code which doesn't XML-encode the concatenated values.

SELECT ID, abc = STUFF
(
    (
        SELECT ',' + name
        FROM temp1 As T2
        -- You only want to combine rows for a single ID here:
        WHERE T2.ID = T1.ID
        ORDER BY name
        FOR XML PATH (''), TYPE
    ).value('.', 'varchar(max)')
, 1, 1, '')
FROM temp1 As T1
GROUP BY id

To understand what's happening, start with the inner query:

SELECT ',' + name
FROM temp1 As T2
WHERE T2.ID = 42 -- Pick a random ID from the table
ORDER BY name
FOR XML PATH (''), TYPE

Because you're specifying FOR XML, you'll get a single row containing an XML fragment representing all of the rows.

Because you haven't specified a column alias for the first column, each row would be wrapped in an XML element with the name specified in brackets after the FOR XML PATH. For example, if you had FOR XML PATH ('X'), you'd get an XML document that looked like:

<X>,aaa</X>
<X>,bbb</X>
...

But, since you haven't specified an element name, you just get a list of values:

,aaa,bbb,...

The .value('.', 'varchar(max)') simply retrieves the value from the resulting XML fragment, without XML-encoding any "special" characters. You now have a string that looks like:

',aaa,bbb,...'

The STUFF function then removes the leading comma, giving you a final result that looks like:

'aaa,bbb,...'

It looks quite confusing at first glance, but it does tend to perform quite well compared to some of the other options.

Richard Deeming
  • 29,830
  • 10
  • 79
  • 151
  • 1
    Hi Richard, Thanks for answer. It's little bit confusing but i almost got complete explanation and logic. – Puneet Chawla Jul 04 '15 at 02:07
  • 3
    What's the use of Type in your query. I think it for defining, the result of XML path will be store in value (not sure explain it if wrong). – Puneet Chawla Jul 04 '15 at 02:12
  • 12
    @PuneetChawla: [The `TYPE` directive](https://msdn.microsoft.com/en-us/library/ms190025.aspx) tells SQL to return the data using the `xml` type. Without it, the data is returned as an `nvarchar(max)`. It's used here to avoid XML-encoding issues if there are special characters in the `name` column. – Richard Deeming Jul 06 '15 at 11:38
  • 1
    Adding STUFF and type/.value together for me had a big impact on query. i had Table valued function XML reader being costly. I left STUFF only and works for my case. – user3790083 Dec 06 '17 at 13:33
  • 1
    I notice that if you delete the parameter `TYPE` and you delete `.value('.', 'varchar(max)')` then it works the same. The `Type` seems to convert it into an XML data type, and the `.value('.', 'varchar(max)')` converts it back out. So you can remove both. (i'm no expert though.. maybe there is some advantage in including those two things?) – barlop Apr 12 '18 at 16:13
  • 3
    @barlop: As [the SimpleTalk article](https://www.red-gate.com/simple-talk/sql/t-sql-programming/concatenating-row-values-in-transact-sql/) explains, if you drop the `TYPE` and `.value('.', 'varchar(max)')`, then you can end up with XML-encoded entities in the result. – Richard Deeming Apr 17 '18 at 16:40
  • 2
    @RichardDeeming do you mean if the data contains or might contain angle brackets? – barlop Apr 17 '18 at 17:29
  • 1
    @barlop: Angle brackets, ampersands, quotes, apostrophes, or anything else that needs to be encoded for use in XML. – Richard Deeming Apr 18 '18 at 10:37
  • 1
    just had a problem where 'item 1 & item 2' was coming out as 'item 1 & item 2' - this answer fixes it (with the TYPE and .value...), thankyou – imma Jun 22 '18 at 10:19
  • 5
    _But, since you haven't specified an element name, you just get a list of values_, this is the insight I was missing. Thank you. – Adam Mar 06 '20 at 11:01
  • varchar(max) truncates my end result which is more than 'max' length, how can i prevent that? – Ferdinand Gaspar Aug 06 '20 at 16:05
  • @FerdinandGaspar `varchar(max)` can store up to 2GB. If you need to concatenate more than that, you'll need to do it outside of SQL Server. You won't be able to do it in .NET either, since .NET strings also have a maximum length of 2GB. – Richard Deeming Aug 07 '20 at 07:28
65

PATH mode is used in generating XML from a SELECT query

1. SELECT   
       ID,  
       Name  
FROM temp1
FOR XML PATH;  

Ouput:
<row>
<ID>1</ID>
<Name>aaa</Name>
</row>

<row>
<ID>1</ID>
<Name>bbb</Name>
</row>

<row>
<ID>1</ID>
<Name>ccc</Name>
</row>

<row>
<ID>1</ID>
<Name>ddd</Name>
</row>

<row>
<ID>1</ID>
<Name>eee</Name>
</row>

The Output is element-centric XML where each column value in the resulting rowset is wrapped in an row element. Because the SELECT clause does not specify any aliases for the column names, the child element names generated are the same as the corresponding column names in the SELECT clause.

For each row in the rowset a tag is added.

2.
SELECT   
       ID,  
       Name  
FROM temp1
FOR XML PATH('');

Ouput:
<ID>1</ID>
<Name>aaa</Name>
<ID>1</ID>
<Name>bbb</Name>
<ID>1</ID>
<Name>ccc</Name>
<ID>1</ID>
<Name>ddd</Name>
<ID>1</ID>
<Name>eee</Name>

For Step 2: If you specify a zero-length string, the wrapping element is not produced.

3. 

    SELECT   

           Name  
    FROM temp1
    FOR XML PATH('');

    Ouput:
    <Name>aaa</Name>
    <Name>bbb</Name>
    <Name>ccc</Name>
    <Name>ddd</Name>
    <Name>eee</Name>

4. SELECT   
        ',' +Name  
FROM temp1
FOR XML PATH('')

Ouput:
,aaa,bbb,ccc,ddd,eee

In Step 4 we are concatenating the values.

5. SELECT ID,
    abc = (SELECT   
            ',' +Name  
    FROM temp1
    FOR XML PATH('') )
FROM temp1

Ouput:
1   ,aaa,bbb,ccc,ddd,eee
1   ,aaa,bbb,ccc,ddd,eee
1   ,aaa,bbb,ccc,ddd,eee
1   ,aaa,bbb,ccc,ddd,eee
1   ,aaa,bbb,ccc,ddd,eee


6. SELECT ID,
    abc = (SELECT   
            ',' +Name  
    FROM temp1
    FOR XML PATH('') )
FROM temp1 GROUP by iD

Ouput:
ID  abc
1   ,aaa,bbb,ccc,ddd,eee

In Step 6 we are grouping the date by ID.

STUFF( source_string, start, length, add_string ) Parameters or Arguments source_string The source string to modify. start The position in the source_string to delete length characters and then insert add_string. length The number of characters to delete from source_string. add_string The sequence of characters to insert into the source_string at the start position.

SELECT ID,
    abc = 
    STUFF (
        (SELECT   
                ',' +Name  
        FROM temp1
        FOR XML PATH('')), 1, 1, ''
    )
FROM temp1 GROUP by iD

Output:
-----------------------------------
| Id        | Name                |
|---------------------------------|
| 1         | aaa,bbb,ccc,ddd,eee |
-----------------------------------
Neha Chopra
  • 1,761
  • 11
  • 11
  • 3
    You write "In Step 4 we are concatenating the values." But it's not clear why/how the `','` specified as column, combined with the `('')` after xml path, causes concatenation to occur – barlop Apr 12 '18 at 14:02
  • 1
    In Step 4, doing any string operation will use the specified wrapping element which is blank ('') for this case. – vCillusion May 28 '18 at 23:47
  • 9
    For anyone wondering about point 4 and why dissapears. It's because after concatenation Name with comma there is no longer column but just value, so SQL Server doesn't know what name for xml tag should be used. For example this query `SELECT 'a' FROM some_table FOR XML PATH('')` will produce: `'aaaaaaa'`. But if column name will be specified: `SELECT 'a' AS Col FROM some_table FOR XML PATH('')` you get result: `aaa` – anth Nov 18 '18 at 08:21
40

There is very new functionality in Azure SQL Database and SQL Server (starting with 2017) to handle this exact scenario. I believe this would serve as a native official method for what you are trying to accomplish with the XML/STUFF method. Example:

select id, STRING_AGG(name, ',') as abc
from temp1
group by id

STRING_AGG - https://msdn.microsoft.com/en-us/library/mt790580.aspx

EDIT: When I originally posted this I made mention of SQL Server 2016 as I thought I saw that on a potential feature that was to be included. Either I remembered that incorrectly or something changed, thanks for the suggested edit fixing the version. Also, pretty impressed and wasn't fully aware of the multi-step review process that just pulled me in for a final option.

Brian Jorden
  • 1,166
  • 10
  • 9
  • 6
    STRING_AGG is not in SQL Server 2016. It is said to be coming in "vNext". – N8allan Apr 26 '17 at 04:57
  • 1
    Oops, I did not mean to overwrite the edit from @lostmylogin sorry about that... That is who actually pushed through the correction edit. – Brian Jorden Nov 06 '17 at 18:34
7

In for xml path, if we define any value like [ for xml path('ENVLOPE') ] then these tags will be added with each row:

<ENVLOPE>
</ENVLOPE>
Cody Gray - on strike
  • 239,200
  • 50
  • 490
  • 574
vikas
  • 81
  • 1
  • 1
4
SELECT ID, 
    abc = STUFF(
                 (SELECT ',' + name FROM temp1 FOR XML PATH ('')), 1, 1, ''
               ) 
FROM temp1 GROUP BY id

Here in the above query STUFF function is used to just remove the first comma (,) from the generated xml string (,aaa,bbb,ccc,ddd,eee) then it will become (aaa,bbb,ccc,ddd,eee).

And FOR XML PATH('') simply converts column data into (,aaa,bbb,ccc,ddd,eee) string but in PATH we are passing '' so it will not create a XML tag.

And at the end we have grouped records using ID column.

Fmanin
  • 519
  • 1
  • 12
  • 25
3

I did debugging and finally returned my 'stuffed' query to it it's normal way.

Simply

select * from myTable for xml path('myTable')

gives me contents of the table to write to a log table from a trigger I debug.

Martin Evans
  • 45,791
  • 17
  • 81
  • 97
SlavaTT
  • 88
  • 9
2
Declare @Temp As Table (Id Int,Name Varchar(100))
Insert Into @Temp values(1,'A'),(1,'B'),(1,'C'),(2,'D'),(2,'E'),(3,'F'),(3,'G'),(3,'H'),(4,'I'),(5,'J'),(5,'K')
Select X.ID,
stuff((Select ','+ Z.Name from @Temp Z Where X.Id =Z.Id For XML Path('')),1,1,'')
from @Temp X
Group by X.ID
ADyson
  • 57,178
  • 14
  • 51
  • 63
Omkar Naik
  • 51
  • 1