Turning related Comma Separated strings into individual rows with multiple columns with one to many

Question

I was looking at the post Turning related Comma Separated strings into individual rows with multiple columns. This works great for me when the data Data is one to one Related Data. But I need more data like Data to Many RelatedDate. I use [] to group each RelatedData to each section of Data.

| SomeID         | OtherID     | Data      | RelatedData |
+----------------+-------------+-----------+-------------+
| abcdef-.....   | cdef123-... | 18,20,22  | [xxx,xxxx],[xxxx],[yyy]|
| abcdef-.....   | 4554a24-... | 17,19     | [a],[bb]     |
| 987654-.....   | 12324a2-... | 13,19,20  | [r],[s],[t]  |

I want to return rows like the following:

| SomeID         | OtherID     | Data | RelatedData |
+----------------+-------------+------+-------------+
| abcdef-.....   | cdef123-... | 18   | xxx         |
| abcdef-.....   | cdef123-... | 20   | xxxx        |
| abcdef-.....   | cdef123-... | 20   |             |
| abcdef-.....   | cdef123-... | 22   | yyy         |
| abcdef-.....   | 4554a24-... | 17   | a           |
| abcdef-.....   | 4554a24-... | 19   | bb          |
| ...            | ...         | ...  | ...         |

I've tried to modify one of the answers from the post to fit my needs but I have no luck as it is pushing my data to a new row.

;WITH cte AS (
  SELECT SomeId, OtherId, TRIM(cs1.Value) AS DataValue, TRIM(REPLACE(REPLACE(cs2.Value, '[', ''), ']', '')) AS RelatedDataValue,
    DataItemNumber = ROW_NUMBER() OVER (PARTITION BY t.SomeId, t.OtherId, cs1.Value ORDER BY (SELECT NULL)),
    RelatedDataItemNumber = ROW_NUMBER() OVER (PARTITION BY t.SomeId, t.OtherId, cs2.Value ORDER BY (SELECT NULL))
  FROM t
  CROSS APPLY STRING_SPLIT(Data, ',') cs1
  CROSS APPLY STRING_SPLIT(REPLACE(REPLACE(RelatedData, '[', ''), ']', ''), ',') cs2
)
SELECT cte.SomeId, cte.OtherId, cte.DataValue, cte.RelatedDataValue
FROM cte
WHERE cte.DataItemNumber = cte.RelatedDataItemNumber
ORDER BY cte.SomeId, cte.OtherId, cte.DataItemNumber;

My suggestion: Normalize your CSV data _before_ bringing it into SQL Server. — Tim Biegeleisen, Jun 02 '23 at 03:53
18,20,22 & [xxx,xxxx],[xxxx],[yyy] I imagine would be 18/xxx,xxxx (causing 2 rows?) 20/xxxx and 22/yyy - please double check your example table — Paul Maxwell, Jun 02 '23 at 04:08
Please put all & only what is needed to ask in your post. Quote with credit & relate to your post. Don't expect us to read an entire reference & guess what is relevant & why. — philipxy, Jun 02 '23 at 09:35
Debug questions require a [mre]--cut & paste & runnable code including initialization; desired & actual output (including verbatim error messages); tags & versions; clear specification & explanation. For SQL include DDL & tabular initialization code. For debug that includes the least code you can give that is code that you show is OK extended by code that you show is not OK. [ask] [Help] When you get a result you don't expect, pause your overall goal, chop to the 1st subexpression with unexpected result & say what you expected & why, justified by documentation. (Debugging fundamental.) — philipxy, Jun 02 '23 at 09:36
PLease do not [delete & repost](https://stackoverflow.com/q/76386365/3404097) questions. You are wasting reader efforts & subverting site protocols. Especially do not repost a closed post, you were told to edit until it passes review. PS It is a violation of site protocols to use multiple accounts to do what is not permitted with 1. — philipxy, Jun 02 '23 at 10:04
In this post's 1st version. you wrote "this my first time posting on stack overflow." — philipxy, Jun 02 '23 at 10:32

Paul Maxwell · Accepted Answer · 2023-06-02T04:43:40.760

This may only take you one step further, as it appears you have "related data" that is a one to many relationship to data. However this may assist. One of the great problems with string_split is that it does NOT supply the index reference of each split and so to get around this you can emulate this by using row_number. Alternatively find a "split string" user defined function and that does output the index (these do exist). So without going into too much detail, and using split_string, you can align the parts to each other this way:

Note I replaced ,[ and [ in RelatedData so that only ] is needed as the delimiter.

--DDL
CREATE TABLE T (
    SomeID VARCHAR(255),
    OtherID VARCHAR(255),
    Data VARCHAR(255),
    RelatedData VARCHAR(255)
);

--DML !! NB SomeID is now unique !!
INSERT INTO T (SomeID, OtherID, Data, RelatedData) 
VALUES 
('abcdef-1....', 'cdef123-...', '18,20,22', '[xxx,xxxx],[xxxx],[yyy]'),
('abcdef-2....', '4554a24-...', '17,19', '[a],[bb]'),
('987654-3....', '12324a2-...', '13,19,20', '[r],[s],[t]');

3 rows affected

update T
set RelatedData = replace(replace(RelatedData,',[',''),'[','')

3 rows affected

select * from t

SomeID	OtherID	Data	RelatedData
abcdef-1....	cdef123-...	18,20,22	xxx,xxxx]xxxx]yyy]
abcdef-2....	4554a24-...	17,19	a]bb]
987654-3....	12324a2-...	13,19,20	r]s]t]

select t.someID, cs1.*, row_number() over(partition by t.SomeID order by (select 1)) rn
FROM t
  CROSS APPLY STRING_SPLIT(Data, ',') cs1

someID	value	rn
987654-3....	13	1
987654-3....	19	2
987654-3....	20	3
abcdef-1....	18	1
abcdef-1....	20	2
abcdef-1....	22	3
abcdef-2....	17	1
abcdef-2....	19	2

select t.someID, cs2.*, row_number() over(partition by t.SomeID order by (select 1)) rn
FROM t
  CROSS APPLY STRING_SPLIT(RelatedData, ']') cs2

someID	value	rn
987654-3....	r	1
987654-3....	s	2
987654-3....	t	3
987654-3....		4
abcdef-1....	xxx,xxxx	1
abcdef-1....	xxxx	2
abcdef-1....	yyy	3
abcdef-1....		4
abcdef-2....	a	1
abcdef-2....	bb	2
abcdef-2....		3

with pdata as (
select t.someID, cs1.*, row_number() over(partition by t.SomeID order by (select 1)) rn
FROM t
  CROSS APPLY STRING_SPLIT(Data, ',') cs1
)
, prelated as (
select t.someID, cs2.*, row_number() over(partition by t.SomeID order by (select 1)) rn
FROM t
  CROSS APPLY STRING_SPLIT(RelatedData, ']') cs2
  )
select
*
from pdata
left join prelated on pdata.SomeID = prelated.SomeID
             and pdata.rn = prelated.rn
order by 1,2

someID	value	rn	someID	value	rn
987654-3....	13	1	987654-3....	r	1
987654-3....	19	2	987654-3....	s	2
987654-3....	20	3	987654-3....	t	3
abcdef-1....	18	1	abcdef-1....	xxx,xxxx	1
abcdef-1....	20	2	abcdef-1....	xxxx	2
abcdef-1....	22	3	abcdef-1....	yyy	3
abcdef-2....	17	1	abcdef-2....	a	1
abcdef-2....	19	2	abcdef-2....	bb	2

fiddle 1

Note I believe that 18 may need to be further divided using a final string_split - but this varies to what is shown in the question.

with pdata as (
select t.someID, cs1.*, row_number() over(partition by t.SomeID order by (select 1)) rn
FROM t
  CROSS APPLY STRING_SPLIT(Data, ',') cs1
)
, prelated as (
select t.someID, cs2.*, row_number() over(partition by t.SomeID order by (select 1)) rn
FROM t
  CROSS APPLY STRING_SPLIT(RelatedData, ']') cs2
  )
select
    pdata.SomeID
  , pdata.value as data
  , cs3.value   as related
from pdata
inner join prelated on pdata.SomeID = prelated.SomeID
                   and pdata.rn = prelated.rn
CROSS APPLY STRING_SPLIT(prelated.value, ',') cs3
order by 1,2

fiddle 2

Turning related Comma Separated strings into individual rows with multiple columns with one to many

1 Answers1