SQL Server Collapsing multiple columns into a string from a one to many relationship

Question

Given a one to many where an Item can be in many Warehouses, doing a straight join returns duplicate items. I'd like to collapse the data from the warehouse table into strings so that there is one record per item. This works, but is there a more efficient approach that avoids doing 2 separate subselects/subqueries?

SELECT
    im.ItemID,
    (SELECT WareHouseID + ',' 
        FROM stockStatusSummary WHERE ItemKey = im.ItemKey for xml path ('')) 
    as 'WarehouseIDs',
    (SELECT Convert(varchar(20), Round(QtyAvailable,3)) + ',' 
        FROM stockStatusSummary
        WHERE ItemKey = im.ItemKey for xml path (''))
    as 'WarehouseQuantity'
FROM item im

With the output looking like this:

ItemID  WarehouseIDs  WarehouseQuantity
------- ------------- -----------------
1       10,20,30,      5,7,9,
2       20,30,40,      6,8,10,
3       30,40,50,      7,9,11,

Are you expecting the order of entries in the two sets of lists to be correlated, e.g. warehouse 30 has quantity 8 of item 2? — HABO, Jul 10 '13 at 19:51
Yes, that's ideal. Probably need explicit ordering in the query from stackStatus Summary. — RyanW, Jul 10 '13 at 20:00
Duplicate of this question? http://stackoverflow.com/questions/5452681/sql-aggregating-strings-together — tgolisch, Jul 10 '13 at 20:01
In MySQL, there is an aggregate function called group_concat(), which will do what you want. I'm not sure if it exists in SQL Server - and after a quick google search, I'm not optimistic that it does. However, there seem to be a bunch of articles out there (including a few on stackoverflow) about how to replicate the group_concat function in SQL Server — mti2935, Jul 10 '13 at 20:58
in oracle, this would be a user defined aggregate function. not sure if that is supported in sql-server. — Randy, Jul 10 '13 at 21:11

Adam Haines · Answer 1 · 2013-07-10T21:10:18.887

There is not really going to be a method that has a golden lining, without testing. You can try a different method to see if it worth your while. I would probably start with the scalar udf method to see if it is worth while. Also make sure to Order you data, so results are consistent.

You can find more detail here, http://jahaines.blogspot.in/2009/06/concatenating-column-values-part-1.html.. Here are some performance tests, http://jahaines.blogspot.in/2009/07/concatenating-column-values-part-2.html

Edit: I wanted to add that this would probably best be done in the application tier, or even a CLR. The bottom line is this is not one of SQL Server's strengths and you will probably have to try a few methods to see what works best for your environment. If you are mostly concerned with legibility the scalar udf method will probably look the cleanest, but may not necessarily be the most scalable.

Thanks for the answer! So far I've tried converting over to using a CTE to provide the data that the subselect is retrieving and there was no significant difference in IO or Time stats. In fact the IO stats were identical between the 2. I'll take a look at using a UDF and your other links too. — RyanW, Jul 11 '13 at 16:21

SQL Server Collapsing multiple columns into a string from a one to many relationship

1 Answers1