SQL group by with a combine to find duplicates in delimited string

Question

I am having issues trying to parse a set of data that contains a column with 1 or more values that are combined by ";". Here is a sample (remember I didnt set up this data configuration, I just have to deal with it):

Create Table #testData
(
    team varchar(20),
    sdate datetime,
    unqKeys varchar(max)
)

Insert Into #testData values ('1','2/1/2017','abcd123;qazwsx123');
Insert Into #testData values ('1','2/1/2017','qweasd123');
Insert Into #testData values 
('1','3/1/2017','qweasd123;564plmnko;poilkj765');
Insert Into #testData values ('2','2/1/2017','ug56ogth4;uhntg456f');
Insert Into #testData values ('2','4/1/2017','fghrty987');
Insert Into #testData values ('2','4/1/2017','ertdfg546;werstf234;qweasd123');

Select * From #testData

What I need to see is where [team] has a value that repeats in the [unqKeys] column. I have tried to use a COALESCE on the unqKeys but I am not able to group them all by team. I need a way to combine all the unqKeys for one team then step through the list checking for duplicates. Since this is SQL Server 2012, I do not have the split_string function but I did attempt to make one with success but I still was not getting them combined with the team id.

In the above example I need to see a positive find for:

team | dupKeys
----   -------
1      qweasd123

This is because the unqKey is present in more than one record for the distinct team.

I have tried to select the team and coalesce the unqKey but this fails with a group by because the combined, unqKey is not a part of the aggregate and if added I just get the results as you see them in the example. I tried to use my split function but then I lose the association to the team but get a table with all unqKeys.

Solution used from @xQbert comments and link:

--Shows the data
Select * From #testData

--Showing the failure
Select Distinct b.team, b.uKey as dupKey from (
    Select a.team
          ,split.a.value('.','nvarchar(50)') as uKey
    From (
        Select team
              ,Cast('<u>' + REPLACE(unqKeys, ';','</u><u>') + '</u>' as XML) as uKey
    From #testData
    ) as a CROSS APPLY uKey.nodes ('/u') as split(a)
) as b
Group By b.team, b.uKey
Having Count(*) > 1

Use for xml path and a cte to normalize the data by splitting the string to rows then group by and having to find dups... (http://stackoverflow.com/questions/13873701/convert-comma-separated-column-value-to-rows) — xQbert, May 02 '17 at 15:31
Glad I could help. Sometimes it's not about syntax just a push with an approach! — xQbert, May 02 '17 at 17:24

SQL group by with a combine to find duplicates in delimited string

0 Answers0