1

I have a json string stored in a string column in BigQuery. There is an Array in it. I would like to pick some fields from array and write its value to BQ columns.

For example - Consider a below json stored in BQ

{
  "pool": "mypool",
  "statusCode": "0",
  "payloads": [
    {
      "name": "request",
      "fullpath": "com.gcp.commontools.edlpayload.EDLPayloadManagerTest$Request",
      "jsonPayload": {
        "body": "{\"data\":\"foo\"}"
      },
      "orientation": "REQUEST",
      "httpTransport": {
        "httpMethod": "POST",
        "headers": {
          "headers": {
            "a": "1"
          }
        },
        "sourceEndpoint": "/v1/foobar"
      }
    },
    {
      "name": "response",
      "fullpath": "com.gcp.commontools.edlpayload.EDLPayloadManagerTest$Response",
      "jsonPayload": {
        "body": "{\"data\":\"bar\"}"
      },
      "orientation": "RESPONSE",
      "httpTransport": {
        "headers": {
          "headers": {
            "b": "2"
          }
        },
        "httpResponseCode": 200
      }
    },
    {
      "name": "attributes",
      "fullpath": "java.util.HashMap",
      "nameValuePairs": {
        "data": {
          "one": "1"
        }
      },
      "orientation": "TRANSITORY"
    }
  ],
  "uuid": "11EC-C714-8ADE2390-9619-1B80E63968CC",
  "payloadName": "my-overall-name"
}

Consider a target BQ table schema is

pool, requestFullPath, requestPayload, responseFullPath, responsePayload

From the above json, i would like to pick few json elements and map there value to a column in BQ. Please note, array of payload will be dynamic in nature. There can be only 1 payload in the payloads array or there can be multiple. And the order of them is not fixed. For example, request payload can come at [0]th position, 1st position etc.

arun
  • 388
  • 2
  • 17

1 Answers1

2

Consider below

select * from (
  select 
    json_value(json_col, '$.pool') as pool, 
    json_value(payload, '$.name') as name, 
    json_value(payload, '$.fullpath') as FullPath, 
    json_value(payload, '$.jsonPayload.body') as Payload, 
  from your_table t
  , unnest(json_extract_array(json_col, '$.payloads')) payload
)
pivot (any_value(FullPath) as FullPath, any_value(Payload) as Payload for name in ('request', 'response') )          

if applied to sample data in your question - output is

enter image description here

Mikhail Berlyant
  • 165,386
  • 8
  • 154
  • 230
  • Hi Mikhail, To extract information from 3rd payload like '$.nameValuePairs.data.one' how would I do that. Below query gives me 2 rows instead of 1 select * from (select json_value(payloadArr, '$.name') as name, json_value(payloadArr, '$.fullpath') as fullPath, json_value(payloadArr, '$.jsonPayload.body') as payload, json_value(payloadArr, '$.nameValuePairs.data.one') as nv, from table t ,unnest(json_extract_array(payload, '$.payloads')) payloadArr ) pivot (any_value(fullPath) as fullPath , any_value(payload) as payload for name in('request', 'response', 'attributes')) – arun May 03 '22 at 13:49
  • And if I add nameValuePairs as pivot then i get extra null fields which I am not looking select * from (select json_value(payload,'$.pool') as pool, json_value(payloadArr, '$.name') as name, json_value(payloadArr, '$.fullpath') as fullPath, json_value(payloadArr, '$.jsonPayload.body') as payload, json_value(payloadArr, '$.nameValuePairs.data.one') as nv, from table t , unnest(json_extract_array(payload, '$.payloads')) payloadArr ) pivot (any_value(fullPath) as fullPath , any_value(payload) as payload, any_value(nv) as nv for name in ('request', 'response', 'attributes') ) – arun May 03 '22 at 14:14
  • Hi Mikhail I have posted follow up question here https://stackoverflow.com/questions/72108683/transforming-arrays-into-columns-in-bq – arun May 04 '22 at 06:19