0

I have an object in bigQuery that stores all possible parameters in the system. Therefore somewhere under the hood the 'parameter' object, has a lot of keys, and when I build a query to SELECT param it returns a lot of columns with null, and maybe only 1 with a value, which makes it impossible to analyze as the output table is incredibly wide.

How can I write the query so that it returns 1 column, with only the non null key/value pair?

i.e.

instead of returning:

param.phone, param.lob, param.destination, param.id, param.1, param.2 etc with null values

enter image description here

i want to see one column with value {"e_line_of_business":"internet"} or any other non-null key/values. It's ok to be stringified.

yulGM
  • 894
  • 1
  • 5
  • 14
  • Please have a look on this question and both answers: https://stackoverflow.com/questions/74358691/how-to-retrieve-the-list-of-dynamic-nested-keys-of-bigquery-nested-records/74359647#74359647 – Samuel Dec 03 '22 at 21:22
  • Thank you. That does look to be the right general direction. However in that example the output is just the non null keys, like `["param1", "param10"]`. I need to extract all key-value pairs that are not null, eg: `{"param1":"value_1", "param10":"value_10"}` – yulGM Dec 03 '22 at 23:47

1 Answers1

3

You might consider below approach.

WITH sample_data AS (
  SELECT STRUCT(STRING(null) AS phone, STRING(null) AS lob, STRING(null) AS destination, 'internet' AS e_line_of_business, STRING(null) AS param1) params
   UNION ALL
  SELECT STRUCT(STRING(null) AS phone, STRING(null) AS lob, STRING(null) AS destination, 'internet' AS e_line_of_business, 'value_1' AS param1)
   UNION ALL
  SELECT STRUCT('01012345678' AS phone, 'web' AS lob, STRING(null) AS destination, null AS e_line_of_business, null AS param1)
)
SELECT params, REPLACE(REGEXP_REPLACE(TO_JSON_STRING(params), r'"[^,{]+"\:null,?', ''), ',}', '}') non_nulls 
  FROM sample_data;

Query results

enter image description here

Jaytiger
  • 11,626
  • 2
  • 5
  • 15