I'm trying to create a table in hive based on some JSON data that I have in hadoop. The hard part is that I have a nested JSON where the data is not cleanly defined in key:value pairs:
{
"trafficSource":{
"source":"(direct)",
"medium":"(none)"
},
"device":{
"browser":"(not set)",
"browserVersion":"(not set)",
"operatingSystem":"(not set)",
"operatingSystemVersion":"(not set)",
"isMobile":false,
"flashVersion":"(not set)",
"javaEnabled":false,
"language":"(not set)",
"screenColors":"(not set)",
"screenResolution":"(not set)",
"deviceCategory":"desktop"
}
"hits":[
{
"customDimensions":[
{
"index":"1",
"value":"2"
},
{
"index":"2",
"value":"0"
}
],
}
],
}
I'd like to create one column in my hive table for each value in "hits.customDimensions" of the JSON. Each array contains an index and a value for that index. For example, a column for index=1 would be named "User_ID" and the value would come from the "value" parameter in that array.
Is there an easy way to create this HIVE table?
Thanks!