
Previously, someone asked about turning a comma-separated string into individual rows. That works great for one column, but I have a follow-up question: what if I want to split two (or more) related columns? That is, suppose I have data like this (using the data from the previous post as a base):

| SomeID         | OtherID     | Data      | RelatedData |
+----------------+-------------+-----------+-------------+
| abcdef-.....   | cdef123-... | 18,20,22  | xxx,,yyy    |
| abcdef-.....   | 4554a24-... | 17,19     | a,bb        |
| 987654-.....   | 12324a2-... | 13,19,20  | r,s,t       |

And want to return rows like the following:

| SomeID         | OtherID     | Data | RelatedData |
+----------------+-------------+------+-------------+
| abcdef-.....   | cdef123-... | 18   | xxx         |
| abcdef-.....   | cdef123-... | 20   |             |
| abcdef-.....   | cdef123-... | 22   | yyy         |
| abcdef-.....   | 4554a24-... | 17   | a           |
| abcdef-.....   | 4554a24-... | 19   | bb          |
| ...            | ...         | ...  | ...         |

I was hoping to use STRING_SPLIT because of its simplicity, but I cannot find a way to make it work. The following does not work: it returns every combination of values from the two splits (a cross product) rather than pairing them up by position:

select OtherID, cs1.Value, cs2.Value
from yourtable
cross apply STRING_SPLIT (Data, ',') cs1
cross apply STRING_SPLIT (RelatedData, ',') cs2

Any suggestions?

dmonder
  • `ROW_NUMBER()`, `JOIN`, then `CROSS APPLY`. – Jeroen Mostert Aug 24 '17 at 12:52
  • Time to redesign the database--this is the real answer. – Jacob H Aug 24 '17 at 12:53
  • Alternatively, if people can't be persuaded to normalize their data, consider JSON. Storing related data together would, at least, greatly simplify these issues. Hopping across "related" columns is very inconvenient no matter how you slice it. – Jeroen Mostert Aug 24 '17 at 12:58
  • I am trying to move data from a very bizarre format (Unidata with multi-valued fields) into a normalized database. Not an easy transition. – dmonder Aug 24 '17 at 13:50

1 Answer


I agree with the comments regarding this design. It would be better to normalize your data.

string_split() does not return an ordinal position, so to do this you can use the row_number() workaround to generate one, like so:

;with cs1 as (
  -- number the items split from Data; note that string_split() does not
  -- guarantee output order, so this ordinal is a practical workaround
  select SomeId, OtherId, x.Value
    , ItemNumber = row_number() over (partition by t.SomeId, t.OtherId order by (select null))
  from t
    cross apply string_split(Data, ',') x
)
, cs2 as (
  -- number the items split from RelatedData the same way
  select SomeId, OtherId, x.Value
    , ItemNumber = row_number() over (partition by t.SomeId, t.OtherId order by (select null))
  from t
    cross apply string_split(RelatedData, ',') x
)
select cs1.SomeId, cs1.OtherId, cs1.Value, cs2.Value
from cs1 
  inner join cs2
    on cs1.SomeId = cs2.SomeId
   and cs1.OtherId = cs2.OtherId
   and cs1.ItemNumber = cs2.ItemNumber

dbfiddle.uk demo

returns:

+--------------+-------------+-------+-------+
|    SomeId    |   OtherId   | Value | Value |
+--------------+-------------+-------+-------+
| 987654-..... | 12324a2-... |    13 | r     |
| 987654-..... | 12324a2-... |    19 | s     |
| 987654-..... | 12324a2-... |    20 | t     |
| abcdef-..... | 4554a24-... |    17 | a     |
| abcdef-..... | 4554a24-... |    19 | bb    |
| abcdef-..... | cdef123-... |    18 | xxx   |
| abcdef-..... | cdef123-... |    20 |       |
| abcdef-..... | cdef123-... |    22 | yyy   |
+--------------+-------------+-------+-------+
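One caveat: the inner join silently drops any positions that exist in only one of the two lists, so if Data and RelatedData can ever contain a different number of items, you may want a full join instead. A sketch, reusing the cs1/cs2 CTE definitions above:

```sql
-- Sketch: replace the final select of the query above with a full join
-- so positions present in only one of the two lists still survive.
select coalesce(cs1.SomeId, cs2.SomeId) as SomeId
  , coalesce(cs1.OtherId, cs2.OtherId) as OtherId
  , cs1.Value, cs2.Value
from cs1
  full join cs2
    on cs1.SomeId = cs2.SomeId
   and cs1.OtherId = cs2.OtherId
   and cs1.ItemNumber = cs2.ItemNumber
```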

I think it is simpler to use a custom splitter function that returns an ordinal, for example DelimitedSplit8K, Jeff Moden's inline table-valued function:

select SomeId, OtherId, cs1.Item, cs2.Item
from t
  cross apply dbo.delimitedsplit8K(Data,',') cs1
  cross apply dbo.delimitedsplit8K(RelatedData,',') cs2
where cs1.ItemNumber = cs2.ItemNumber

returns the same results.
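For what it's worth, SQL Server 2022 and Azure SQL Database later added an optional enable_ordinal argument to STRING_SPLIT itself, which makes the original attempt workable without a custom function (assuming you are on one of those versions; the position column it returns is named ordinal):

```sql
-- Requires SQL Server 2022+ or Azure SQL Database, where STRING_SPLIT
-- accepts a third enable_ordinal argument and returns an ordinal column.
select t.SomeId, t.OtherId, cs1.value, cs2.value
from t
  cross apply string_split(t.Data, ',', 1) cs1
  cross apply string_split(t.RelatedData, ',', 1) cs2
where cs1.ordinal = cs2.ordinal
```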

SqlZim
  • This worked like a charm. I tried both, and do like the second version as it is much cleaner. Thanks! – dmonder Aug 24 '17 at 14:08