45

This is the below Hive Table

CREATE EXTERNAL TABLE IF NOT EXISTS SampleTable
(
USER_ID BIGINT,
NEW_ITEM ARRAY<STRUCT<PRODUCT_ID: BIGINT,TIMESTAMPS:STRING>>
)

And this is the data in the above table-

1015826235     [{"product_id":220003038067,"timestamps":"1340321132000"},{"product_id":300003861266,"timestamps":"1340271857000"}]

Is there any way I can get the below output from the HiveQL after exploding the array?

**USER_ID**  |  **PRODUCT_ID**  |   **TIMESTAMPS**
 ------------+------------------+----------------
1015826235      220003038067       1340321132000
1015826235      300003861266       1340271857000

Updated

I wrote this query to get the output in the above format, but it is not giving me the result in the way I wanted to.

SELECT myTable1.myCol1,myTable2.myCol2 FROM sampletable st LATERAL VIEW 
explode(st.purchased_item.product_id) myTable1 AS myCol1 LATERAL VIEW 
explode(st.purchased_item.timestamps) myTable2 AS myCol2;

Can anyone help me what wrong I am doing? Any suggestions will be appreciated.

arsenal
  • 23,366
  • 85
  • 225
  • 331
  • How about something like this? select user_id, prod_and_ts.product_id as product_id, prod_and_ts.timestamps as timestamps from SampleTable LATERAL VIEW explode(new_item) exploded_table as prod_and_ts; – Mark Grover Jul 08 '12 at 15:27
  • @Mark,Thanks Mark, It worked, Can you post this as an answer so that I can accept it. And can you please also take a look into this SO question also. [http://stackoverflow.com/questions/11336950/joining-two-tables-in-hive-using-hiveqlhadoop](http://stackoverflow.com/questions/11336950/joining-two-tables-in-hive-using-hiveqlhadoop). As nobody has replied yet on this question. It will be of great help to me. Thanks for your time. – arsenal Jul 08 '12 at 18:46
  • Glad it helped. Posted the answer. Will take a look at the other question soon! – Mark Grover Jul 09 '12 at 12:56
  • hey webby i just wanted know its a vary good question you asked but i am stuck at first step only i am not able to create an array of struct in hive . it would be really vary helpful if you could help me create an array of struct – sachingupta Jan 06 '15 at 13:44

3 Answers3

86

You need to explode only once (in conjunction with LATERAL VIEW). After exploding you can use a new column (called prod_and_ts in my example) which will be of struct type. Then, you can resolve the product_id and timestamps members of this new struct column to retrieve the desired result.

SELECT
   user_id,
   prod_and_ts.product_id as product_id,
   prod_and_ts.timestamps as timestamps
FROM 
   SampleTable 
   LATERAL VIEW explode(new_item) exploded_table as prod_and_ts;
Mark Grover
  • 4,070
  • 22
  • 21
  • And one more question I have posted, as it is more kind of theoretical question related to Performance measurement.[http://stackoverflow.com/questions/11404163/custom-mapper-and-reducer-vs-hiveql](http://stackoverflow.com/questions/11404163/custom-mapper-and-reducer-vs-hiveql). I apologize if I am bothering you so much as on SO, there are not that much BIG DATA expert here. So that is the reason I am pinging you. Really Appreciated all your help... – arsenal Jul 10 '12 at 07:45
  • Hi Mark, Thanks for all your help. I have posted similar question related to Exploding Array of Struct in Hive but this time data is different somehwat. Can you please take a look if that is possible to do it? [http://stackoverflow.com/questions/11550651/exploding-array-of-struct-using-hiveql](http://stackoverflow.com/questions/11550651/exploding-array-of-struct-using-hiveql) – arsenal Jul 18 '12 at 21:30
  • Hi Mark, how can we explode and create view for multiple array> columns. could you please help on my request http://stackoverflow.com/questions/37282646/how-to-create-view-for-struct-fields-in-hive – Ranjith Sekar May 25 '16 at 06:40
  • Do you have any idea how can we do multiple explode? each record of mine has 2 column of struct which I need to explode them but when I use two consecutive explode, the time I am going to "group by" them by a key. It returns some error! – Reihan_amn Feb 22 '18 at 23:34
11

If you are on Hive 0.10 or later, you could also use inline(ARRAY<STRUCT[,STRUCT]>). It explodes an array of structs into a table.

Tariq
  • 34,076
  • 8
  • 57
  • 79
  • It's a useful answer, but it doesn't fully answers the question. This way, the the top-level field, i.e. `USER_ID` is not in the results. – Jakub Kukul Mar 13 '17 at 16:46
0

You can explode your array in the following manner:

select USER_ID,items from Sample_Table lateral view explode(NEW_ITEM) temp_table as items;
SachinT
  • 99
  • 1
  • 9