-1

I am trying to create a process in SQL Server now that was completed in Hive using the Posexplode function.

I have a large dataset that looks like:

    userid     sku                         qty
1   abc        2427022                      3
2   abc        1883575|2427022             1|3
3   def        2427022|1562336|153842      1|1|1
4   ghi        2427022                       3

I would like the data to look like:

    userid  sku qty
1   abc 2427022 3
2   abc 1883575 1
3   abc 2427022 3
4   def 2427022 1
5   def 1562336 1
6   def 153842  1
7   ghi 2427022 3

Thanks!

Ed M.
  • 21
  • 3
  • 1
    https://stackoverflow.com/questions/5493510/turning-a-comma-separated-string-into-individual-rows – axawire Dec 06 '17 at 22:03
  • 1
    SO is not a tutorial site or a pool of ppl who do your work. Instead ask a specific question with code you have tried to fix your problem. Please also read https://stackoverflow.com/help/how-to-ask – common sense Dec 06 '17 at 22:03
  • In addition to the link you will probably have to add row_number() to the selects so you can combine the correct values in the sku and qty fields that belong together in the same row. – axawire Dec 06 '17 at 22:09

1 Answers1

1

I figured it out. I used the below code to get the desired output if someone is having a similar problem separating 2 delimited columns.I used the Cross Apply function and XML in SQL Server to get the same output as using the posexplode function in Hive.

--Create test table
create table Table1 (userid varchar(max), Sku varchar(max), Qty varchar(max))
insert Table1 select 'abc', '2427022' , '3'
insert Table1 select 'abc', '1883575|2427022', '1|3'
insert Table1 select 'def', '2427022|1562336|153842' , '1|1|1'
insert Table1 select 'ghi', '2427022' , ' 3'


;WITH CTE1
AS
(
  SELECT RN= Row_Number () over(Order by userid),userid,  
     Split.a.value('.', 'VARCHAR(100)') AS Sku
 FROM  
 (
     SELECT Userid,Sku,  
         CAST ('<M>' + REPLACE(sku, '|', '</M><M>') + '</M>' AS XML) AS Data  
     FROM  Table1
 ) AS A CROSS APPLY Data.nodes ('/M') AS Split(a)
 )
,CTE2 
AS
(
   SELECT RN= Row_Number () over(Order by userid),userid,  
     Split.b.value('.', 'VARCHAR(100)') AS qty
 FROM  
 (
     SELECT Userid,qty,  
         CAST ('<M>' + REPLACE(qty, '|', '</M><M>') + '</M>' AS XML) AS Data  
     FROM  Table1
 ) AS B CROSS APPLY Data.nodes ('/M') AS Split(b)
 )
 Select c.userid, c.Sku, d.qty
 from CTE1 as c
 Inner Join CTE2 as d
 on c.RN =d.RN
Ed M.
  • 21
  • 3