2

I have a table in Hive as below -

create table somedf 
(sellers string , 
 orders int 
)

insert into somedf values
('1--**--2--**--3',50),
('1--**--2', 10)

The table has a column called sellers and it is delimited by the characters described in the insert statement. I would like to split the sellers into multiple rows so that it looks like below -

exploded_sellers orders
1                 50
2                 50
3                 50
1                 10
2                 10

I am trying to use lateral view explode() function in Hive but unable to get the results. I am using the below query -

select exploded_sellers, orders
from somedf
lateral view outer explode(split(sellers,'\\--*.*\\*.*--')) t1 as exploded_sellers

which gives me below results as output -

exploded_sellers orders
1                 50
3                 50
1                 10
2                 10

This result does not split Row 1('1--**--2--**--3',50) from the table as desired and ends up in producing only 2 rows instead of 3.

Is there any other function that is needed for this task? Does lateral view explode() only work on arrays ?

Regressor
  • 1,843
  • 4
  • 27
  • 67

2 Answers2

2

The pattern passed into split is incorrect. * character needs to be escaped. No need to escape -.

Use

select exploded_sellers, orders
from somedf
lateral view outer explode(split(sellers,'--\\*\\*--')) t1 as exploded_sellers
Vamsi Prabhala
  • 48,685
  • 4
  • 36
  • 58
1

This would work too. It expects two occurrences of * in the middle.

select exploded_sellers, orders
from somedf
lateral view outer explode(split(sellers,'--\\*{2}--')) t1 as exploded_sellers;
jose_bacoy
  • 12,227
  • 1
  • 20
  • 38