0

Original Question - Transform Array into columns in BigQuery

Solution suggested in the original question works well when you want to extract the same information from the array elements. But in my case, the information that I want to extract from each array element can be different. For example- If you see the original question, the 3rd Array element we have doesn't have jsonPayload but instead it has nameValuePairs in it. If I use pivot there then unnecessary fields get created. How to avoid them, I know we can use EXCEPT but I don't think that is a good solution because If I have to choose different elements from each array element, it would be really a mess. As I can have 10+ payloads in the payloads array.

SQL -

select * from (
  select 
    json_value(payload,'$.pool') as pool,
    json_value(payloadArr, '$.name') as name, 
    json_value(payloadArr, '$.fullpath') as fullPath,
    json_value(payloadArr, '$.jsonPayload.body') as payload,
    json_value(payloadArr, '$.nameValuePairs.data.one') as nv, 
  from  table t
  , unnest(json_extract_array(payload, '$.payloads')) payloadArr
)
pivot (any_value(fullPath) as fullPath , any_value(payload) as payload,  any_value(nv) as nv for name in ('request', 'response', 'attributes') ) 

enter image description here

Sai Chandra Gadde
  • 2,242
  • 1
  • 3
  • 15
arun
  • 388
  • 2
  • 17
  • why not use bigquery JS UDF to have more control over the field filters & operations dynamically? https://stackoverflow.com/questions/68709560/bigquery-transform-generic-json-to-struct/68718701#68718701 – Logan May 04 '22 at 16:51
  • Hello @arunkindra. Can you let me know if your issue is resolved? – Kabilan Mohanraj May 10 '22 at 09:32
  • 1
    Hi @KabilanMohanraj it is working fine with the query suggested by Mikhail. The point to note that we need one unique identifier in all the records – arun May 11 '22 at 12:03

1 Answers1

1

Use below

select * from (
  select 
    json_value(payload,'$.pool') as pool,
    json_value(payloadArr, '$.name') as name, 
    json_value(payloadArr, '$.fullpath') as fullPath,
    coalesce(
      json_value(payloadArr, '$.jsonPayload.body'),
      json_value(payloadArr, '$.nameValuePairs.data.one')
    ) as payload, 
  from  table t
  , unnest(json_extract_array(payload, '$.payloads')) payloadArr
)
pivot (any_value(fullPath) as fullPath , any_value(payload) as payload for name in ('request', 'response', 'attributes') )     

with output

enter image description here

Mikhail Berlyant
  • 165,386
  • 8
  • 154
  • 230
  • Hi @mikhail There is one problem with both the answer, I have more than one record in my table, it is just returning one record because of aggregate function any_value. How can I apply same operation on each row and get the above desired ans. – arun May 05 '22 at 07:05
  • do all records have same pool and name? I thought those two will define the row. isn't this the case? – Mikhail Berlyant May 05 '22 at 15:06
  • These can be the same, so we have one uuid which would be unique between each records. I just added them it is looks working fine. – arun May 06 '22 at 09:27