0

I've looked around for quite a while to try and find the answer to my specific issue but I'm not having any luck.

I'm using the following code to export to an XML Path:

SELECT 
    productid, title, 
    (SELECT colors + ' ' 
     FROM dbo.productdetails 
     WHERE (active = 1) 
       AND (productid = j.productid) 
     GROUP BY colors
     FOR XML PATH('')) AS 'color_tags'     
FROM   
    dbo.jewelry AS j

I have a group by applied and it's working but it's viewing things like "black blue" and "black green" as entire values so basically I'm getting duplicate words when it outputs the XML. I've also tried DISTINCT but it does the exact same thing.

I have some data stored in a table like this:

+-----------+-------------+
| productid | color_tags  |
+-----------+-------------+
|         1 | black       |
|         1 | black blue  |
|         1 | black green |
|         1 | blue green  |
|         1 | black       |
+-----------+-------------+

The data I want to have on a single line is like this (basically no duplicate values):

black, blue, green

But what I'm getting is this:

black, black blue, black green, blue green

Any help would be appreciated, thank you!

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
  • Very poor schema design. What are you asking for is first parsing semi CSV(space is separator), then do DISTINCT/GROUP BY per productid and finally do STRING_AGG using XML trick. It is doable but is it worth doing it that way? I would normalize it instead of patching bad design. – Lukasz Szozda Aug 10 '17 at 16:59
  • @lad2025 So to normalize, would I need to create a color_tags table and use that to store all the tags for products? – casual-beast Aug 10 '17 at 17:14
  • What does `productdetails.colors` actually contain? One color or multiple? It looks like you're missing a `DISTINCT` before doing the whole string concatenation dance. – Jeroen Mostert Aug 10 '17 at 20:20
  • @JeroenMostert It contains multiple colors separated by spaces. I already tried using DISTINCT but I get the same result. – casual-beast Aug 11 '17 at 12:31
  • I see, your question is a bit unclear mixing unrelated tables and queries (`color_tags` isn't in the query, nor are you actually concatenating with commas). Yes, if your table already has denormalized columns, you need to split them first, then aggregate them back. This is painful; changing the design would be advisable. From SQL Server 2016, `STRING_SPLIT` simplifies splitting; from SQL Server 2017, `STRING_AGG` simplifies concatenation. Prior to these, aggregating can be done with the `FOR XML PATH` trick, but splitting is a [massive pain](https://stackoverflow.com/questions/2647). – Jeroen Mostert Aug 11 '17 at 12:37
  • @JeroenMostert Thanks for your help. I think I'm going to change the design like you said (even though it's gonna suck). It'll be better off in the long term. The database I'm working off of is 15+ years old so it's got a lot of stuff that isn't ideal :( – casual-beast Aug 15 '17 at 13:19

0 Answers0