0

We have a result set that has three fields and each of those fields is either null or contains a comma separated list of strings.

We need to combine all three into one comma separated list and eliminate duplicates.

What is the best way to do that?

I found a nice function that can split a string and return a table:

T-SQL split string

I tried to create a UDF that would take three varchar parameters and call that split string function three times, combine them into one table, and then use a FOR XML from there and return it as one comma separated string.

But SQL is complaining about having a SELECT in a function.

Community
  • 1
  • 1
tnktnk
  • 512
  • 2
  • 7
  • 20

2 Answers2

2

Here's an example using the SplitString function you referenced.

DECLARE
    @X varchar(max) = 'A, C, F'
  , @Y varchar(max) = null
  , @Z varchar(max) = 'A, D, E, A'

;WITH SplitResults as
(
  -- Note: the function does not remove leading spaces.
  SELECT LTRIM([Name]) [Name] FROM SplitString(@X)
  UNION
  SELECT LTRIM([Name]) [Name] FROM SplitString(@Y)
  UNION
  SELECT LTRIM([Name]) [Name] FROM SplitString(@Z)
)
SELECT STUFF((
  SELECT ', ' + [Name]
  FROM SplitResults
  FOR XML PATH(''), TYPE
-- Note: here we're pulling the value out in case any characters were escaped, ie. &
--       and then STUFF is removing the leading ,<space>
).value('.', 'nvarchar(max)'), 1, 2, '')
jdl
  • 1,104
  • 8
  • 12
0

I would not store data as a comma separated string in a single field. Separate the string to a new table and combine it to a string again when you need to.

Finding duplicates and managing the data will also be much easier.

I've used this function before (I didn't write it, and unfortunately cannot remember where I found it) to split a string and add a key (in this case an int) to the data as a separate table, linking back to the original table's PK

CREATE FUNCTION SplitWithID (@id int, @sep VARCHAR(10), @s VARCHAR(MAX))
RETURNS @t TABLE
    (
        id int,
        val VARCHAR(MAX)
    )   
AS
    BEGIN
        DECLARE @xml XML
        SET @XML = N'<root><r>' + REPLACE(@s, @sep, '</r><r>') + '</r></root>'
        INSERT INTO @t(id,val)
        SELECT @id, r.value('.','VARCHAR(40)') as Item
        FROM @xml.nodes('//root/r') AS RECORDS(r)
        RETURN
    END
GO

Once you have the data on separate rows you can use any duplicate removal technique to clean the data before applying a primary key to the table.

Tony
  • 9,672
  • 3
  • 47
  • 75