This is really ugly, really really ugly. I don't for one second suggest doing this in your RDBMS, and really I suggest you fix your data. You should not be storing "delimited" (I use that word loosely to describe your data) data in your tables you should be storing in in separate columns and rows. In this case, the first "code" should be in one column, with a one to many relationship with another table with the codes you're trying to extract.
As you haven't tagged or mentioned your Version of SQL Server I've used the latest SQL Server syntax. STRING_SPLIT
is available in SQL Server 2016+ and STRING_AGG
in 2017+. If you aren't using those versions you will need to replace those functions with a suitable alternative (I suggest delimitedsplit8k(_lead)
and FOR XML PATH
respectively).
Anyway, what this does. Firstly we need to fix that data to something more useable, so I change the double hyphens (--
) to a Pipe (|
), as that doesn't seem to appear in your data. Then then use that pipe to split your data into parts (individual codes).
Because your delimiter is inconsistent (it isn't a consistent width) this leaves some codes with a leading hyphen, so I have to then get rid of that. Then I use my answer from your other question to split the code further into it's components, and reverse the WHERE
; previously the answer was looking for "bad" rows, where as now we want "good" rows.
Then after all of that, it's as "simple" as using STRING_AGG
to delimit the "good" rows:
SELECT STRING_AGG(ca.Code,',') AS Codes
FROM (VALUES('3/1151---------366-500-2570533-1'),
('9/6809---------------------368-510-1872009-1'),
('1-260752-305-154----------------154-200-260752-1--------154-800-13557-1'),
('2397/35425---------------------------------377-500-3224575-1'),
('17059----------------377-500-3263429-1'),
('126/42906---------------------377-500-3264375-1'),
('2269/2340-------------------------377-500-3065828-1'),
('2267/767---------377-500-1452908-4'),
('2395/118593---------377-500-3284699-1'),
('2395/136547---------377-500-3303413-1'),
('92/10260---------------------------377-500-1636038-1'),
('2345-2064---------377-500-3318493-1'),
('365-2290--------377-500-3278261-12'),
('365-7212--------377-500-2587120-1')) V(Codes)
CROSS APPLY (VALUES(REPLACE(V.Codes,'--','|'))) D(DelimitedCodes)
CROSS APPLY STRING_SPLIT(D.DelimitedCodes,'|') SS
CROSS APPLY (VALUES(CASE LEFT(SS.[value],1) WHEN '-' THEN STUFF(SS.[value],1,1,'') ELSE SS.[value] END)) ca(Code)
CROSS APPLY (VALUES(PARSENAME(REPLACE(ca.Code,'-','.'),4),
PARSENAME(REPLACE(ca.Code,'-','.'),3),
PARSENAME(REPLACE(ca.Code,'-','.'),2),
PARSENAME(REPLACE(ca.Code,'-','.'),1))) PN(P1, P2, P3, P4)
WHERE LEN(PN.P1) = 3
AND LEN(PN.P2) = 3
AND LEN(PN.P3) BETWEEN 5 AND 7
AND LEN(PN.P4) BETWEEN 1 AND 2
AND ca.Code NOT LIKE '%[^0-9\-]%' ESCAPE '\'
GROUP BY V.Codes;
db<>fiddle