0

I have data that currently looks like this (pipe indicates separate columns):

ID | Sex | Purchase           | Type  
 1 | M   | Apple, Apple       | Food, Food  
 2 | F   | Pear, Barbie, Soap | Food, Toys, Cleaning  

As you can see, the Purchase and Type columns feature multiple values that are comma delimited (some of the cells in these columns actually have up to 50+ values recorded within). I want the data to look like this:

ID | Sex | Purchase | Type  
 1 | M   | Apple    | Food  
 1 | M   | Apple    | Food  
 2 | F   | Pear     | Food  
 2 | F   | Barbie   | Toys  
 2 | F   | Soap     | Cleaning

Any ideas on how would I be able to do this with SQL? Thanks for your help everyone.

Edit: Just to show that this is different to some of the other questions. The key here is that data for each unique row is contained across two separate columns i.e. the second word in "Purchase" should be linked with the second word in "Type" for ID #1. The other questions I've seen was where the multiple values had been contained in just one column.

cranpibae
  • 11
  • 1
  • 2
  • I dont think its a duplicate of either of these because the important part is the two nested different delimiters. – TomC Aug 23 '18 at 01:29
  • Sorry, I tried the other questions but I think mine is slightly different because the two columns have multiple values each delimited by commas. The tricky bit is how to append these into 5 rows (using an approach that takes the first, second, and third word from each of the delimited columns, for its own row), rather than 13 rows (if i were to perform a join function on the values of the two columns, creating 2*2 + 3*3 rows). – cranpibae Aug 23 '18 at 05:27

2 Answers2

1

Basically you will required a delimited spliter function. There are many around. Here i am using DelimitedSplit8K from Jeff Moden http://www.sqlservercentral.com/articles/Tally+Table/72993/

-- create the sample table
create table #sample
(
    ID  int,
    Sex char,
    Purchase    varchar(20),
    Type        varchar(20)
)

-- insert the sample data
insert into #sample (ID, Sex, Purchase, Type) select 1, 'M', 'Apple,Apple', 'Food,Food'
insert into #sample (ID, Sex, Purchase, Type) select 2, 'M', 'Pear,Barbie,Soap', 'Food,Toys,Cleaning'

select  s.ID, s.Sex, Purchase = p.Item, Type = t.Item
from    #sample s
        cross apply DelimitedSplit8K(Purchase, ',') p
        cross apply DelimitedSplit8K(Type, ',') t
where   p.ItemNumber    = t.ItemNumber

drop table #sample
Squirrel
  • 23,507
  • 4
  • 34
  • 32
0

EDIT: The original question as posted had the data as strings, with pipe characters as column delimiters and commas within the columns. The below solution works for that.

The question has since been edited to show that the input data is actually in columns, not as a single string.

I've left the solution here as an interesting version of the original question.


This is an interesting problem. I have a solution that works for a single row of your data. I dont know from the question if you are going to process it row by row, but I assume you will.

If so, this will work. I suspect there might be a better way using xml or without the temp tables, but in any case this is one solution.

declare @row varchar(1000); set @row='2 | F | Pear, Barbie, Soap | Food, Toys, Cleaning'

declare @v table(i int identity, val varchar(1000), subval varchar(100))
insert @v select value as val, subval from STRING_SPLIT(@row,'|')
cross apply (select value as subval from STRING_SPLIT(value,',') s) subval

declare @v2 table(col_num int, subval varchar(100), correlation int)
insert @v2
select col_num, subval,
DENSE_RANK() over (partition by v.val order by i) as correlation
from @v v
join (
    select val, row_number()over (order by fst) as Col_Num 
    from (select val, min(i) as fst  from @v group by val) colnum
    ) c on c.val=v.val
order by i

select col1.subval as ID, col2.subval as Sex, col3.subval as Purchase, col4.subval as Type
from @v2 col1
join @v2 col2 on col2.col_num=2
join @v2 col3 on col3.col_num=3
join @v2 col4 on col4.col_num=4 and col4.correlation=col3.correlation
where col1.col_num=1

Result is:

ID  Sex Purchase    Type
2    F   Pear    Food
2    F   Barbie  Toys
2    F   Soap    Cleaning
TomC
  • 2,759
  • 1
  • 7
  • 16