10

I have a table as follows:

user_id email
u1      e1, e2
u2      null

My goal is to convert this into the following format:

user_id email
u1      e1
u1      e2
u2      null

So for this I am using the lateral view explode() function in Hive, as follows:

select *  FROM  table
       LATERAL VIEW explode (split(email ,',')) email AS email_id 

But doing this the u2 row is getting skipped as it has null value in email. How can we include the nulls too in the output?

Edit: I am using a workaround doing an union of this table with the base table without explode, but I think the data will be scanned one more time because of this. I wanted to know if there is a better way to do it.

1 Answers1

18

include OUTER in the query to get rows with NULL values

something like,

select *  FROM table LATERAL VIEW OUTER explode (  split (  email  ,','  ) ) email AS email_id;

check this link -> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LateralView#LanguageManualLateralView-OuterLateralViews

Nirmal Ram
  • 1,180
  • 2
  • 9
  • 18