0

Can someone suggest me how to write Hive SQL query with the below details.

  • input: (tab delimited data)

    Name Phone Address abc 123:456:789 hyderabad,bangalore,chennai xyz 111:222:333 delhi,mumbai,pune pqr 321:654:999 add1,add2,add3

  • output:

    Name Phone Address abc 123 hyderabad abc 456 bangalore abc 789 chennai xyz 111 delhi xyz 222 mumbai xyz 333 pune pqr 321 add1 pqr 654 add2 pqr 999 add3

Can someone help me how to split the column data and place it as a new record Without using any default functions. Thanks in advance.

Manindar
  • 999
  • 2
  • 14
  • 30
  • possible duplicate with this: https://stackoverflow.com/questions/20667473/hive-explode-lateral-view-multiple-arrays – jose_bacoy Feb 15 '18 at 17:36

2 Answers2

2

Use posexplode,split.Since you need the indexes of the split columns to match,you will need to use posexplode to create indexes and ensure they match when you convert the columns to rows.

select Name, Phone, Address 
from table 
      lateral view posexplode(split(Phone,':')) Phone AS index1, Phone
      lateral view posexplode(split(Address,',')) Address AS index2, Address
where index1=index2;
nobody
  • 10,892
  • 8
  • 45
  • 63
0

If we are not aware of the pre-defined functions used for this. We can write the custom logic like below (Only if we know the delimiters and #of splits).

select name, substr(phone,instr(phone,':',1)-1) phone, substr(address,instr(address,',',1)-1)
union all 
select name, substr(phone,instr(phone,':',2)-1) phone, substr(address,instr(address,',',2)-1)
union all 
select name, substr(phone,instr(phone,':',2)+1,length(phone)) phone, substr(address,instr(address,',',2)+1,length(address))

This may be helpful if we don't know the predefined methods.

Manindar
  • 999
  • 2
  • 14
  • 30