0

Given a one to many where an Item can be in many Warehouses, doing a straight join returns duplicate items. I'd like to collapse the data from the warehouse table into strings so that there is one record per item. This works, but is there a more efficient approach that avoids doing 2 separate subselects/subqueries?

SELECT
    im.ItemID,
    (SELECT WareHouseID + ',' 
        FROM stockStatusSummary WHERE ItemKey = im.ItemKey for xml path ('')) 
    as 'WarehouseIDs',
    (SELECT Convert(varchar(20), Round(QtyAvailable,3)) + ',' 
        FROM stockStatusSummary
        WHERE ItemKey = im.ItemKey for xml path (''))
    as 'WarehouseQuantity'
FROM item im

With the output looking like this:

ItemID  WarehouseIDs  WarehouseQuantity
------- ------------- -----------------
1       10,20,30,      5,7,9,
2       20,30,40,      6,8,10,
3       30,40,50,      7,9,11,
RyanW
  • 5,338
  • 4
  • 46
  • 58
  • Are you expecting the order of entries in the two sets of lists to be correlated, e.g. warehouse 30 has quantity 8 of item 2? – HABO Jul 10 '13 at 19:51
  • Yes, that's ideal. Probably need explicit ordering in the query from stackStatus Summary. – RyanW Jul 10 '13 at 20:00
  • Duplicate of this question? http://stackoverflow.com/questions/5452681/sql-aggregating-strings-together – tgolisch Jul 10 '13 at 20:01
  • In MySQL, there is an aggregate function called group_concat(), which will do what you want. I'm not sure if it exists in SQL Server - and after a quick google search, I'm not optimistic that it does. However, there seem to be a bunch of articles out there (including a few on stackoverflow) about how to replicate the group_concat function in SQL Server – mti2935 Jul 10 '13 at 20:58
  • in oracle, this would be a user defined aggregate function. not sure if that is supported in sql-server. – Randy Jul 10 '13 at 21:11

1 Answers1

1

There is not really going to be a method that has a golden lining, without testing. You can try a different method to see if it worth your while. I would probably start with the scalar udf method to see if it is worth while. Also make sure to Order you data, so results are consistent.

You can find more detail here, http://jahaines.blogspot.in/2009/06/concatenating-column-values-part-1.html.. Here are some performance tests, http://jahaines.blogspot.in/2009/07/concatenating-column-values-part-2.html

Edit: I wanted to add that this would probably best be done in the application tier, or even a CLR. The bottom line is this is not one of SQL Server's strengths and you will probably have to try a few methods to see what works best for your environment. If you are mostly concerned with legibility the scalar udf method will probably look the cleanest, but may not necessarily be the most scalable.

Adam Haines
  • 900
  • 5
  • 7
  • Thanks for the answer! So far I've tried converting over to using a CTE to provide the data that the subselect is retrieving and there was no significant difference in IO or Time stats. In fact the IO stats were identical between the 2. I'll take a look at using a UDF and your other links too. – RyanW Jul 11 '13 at 16:21