1

Good morning,

In big query I have a table with one column on the following form (I show 3 lines):

[{'a':1,'b':4, 'c':5}, {'a':0,'b':7, 'c':8},{'a':4,'b':9, 'c':12}]

[{'a':9,'b':10, 'c':9}]

[{ 'a':5,'b':10, 'c':9}, {'a':1,'b':10, 'c':9}, {'a':7,'b':10, 'c':9}]

that is, I have an array (Not fixed length) of Jsons. For every row I would like to extract (create a new column), if it is present, the value of the key 'b' when the key 'a'=1 (for every row the key 'a' can be equal 1 just one time). I can not import external packages.

Thank you for the help!

Sara
  • 43
  • 1
  • 4
  • Could you post a screenshot of those first rows (also an example of the wanted output)? Also it would be great to know the schema of the table, for what I see it looks like a nested repeated field would work quite well. – Iñigo Sep 27 '19 at 09:55
  • I wrote the first tree rows. There is just a table with one column. The output I want in this case is, one column with a numeric value, in this case 4 in the first row, and 10 in the third row. The second don't contain an interesting value. Sorry, I can not show the data. – Sara Sep 27 '19 at 09:59
  • Possible duplicate of [Bigquery - json\_extract all elements from an array](https://stackoverflow.com/questions/52120182/bigquery-json-extract-all-elements-from-an-array) – Pentium10 Sep 27 '19 at 10:00

2 Answers2

3

Below is for BigQuery Standard SQL

#standardSQL
CREATE TEMP FUNCTION json2array(json STRING)
RETURNS ARRAY<STRING>
LANGUAGE js AS """
  return JSON.parse(json).map(x=>JSON.stringify(x));
"""; 
WITH `project.dataset.table` AS (
  SELECT "[{'a':1,'b':4, 'c':5}, {'a':0,'b':7, 'c':8},{'a':4,'b':9, 'c':12}]" json UNION ALL
  SELECT "[{'a':9,'b':10, 'c':9}]" UNION ALL
  SELECT "[{ 'a':5,'b':10, 'c':9}, {'a':1,'b':10, 'c':9}, {'a':7,'b':10, 'c':9}]"
)
SELECT json, JSON_EXTRACT_SCALAR(x, '$.b') AS b
FROM `project.dataset.table`,
  UNNEST(json2array(JSON_EXTRACT(json, '$'))) x
WHERE JSON_EXTRACT_SCALAR(x, '$.a') = '1'

with result

Row json                                                                        b    
1   [{'a':1,'b':4, 'c':5}, {'a':0,'b':7, 'c':8},{'a':4,'b':9, 'c':12}]          4    
2   [{ 'a':5,'b':10, 'c':9}, {'a':1,'b':10, 'c':9}, {'a':7,'b':10, 'c':9}]      10   

In case if you need to preserve all original rows - use below

#standardSQL
CREATE TEMP FUNCTION json2array(json STRING)
RETURNS ARRAY<STRING>
LANGUAGE js AS """
  return JSON.parse(json).map(x=>JSON.stringify(x));
"""; 
WITH `project.dataset.table` AS (
  SELECT "[{'a':1,'b':4, 'c':5}, {'a':0,'b':7, 'c':8},{'a':4,'b':9, 'c':12}]" json UNION ALL
  SELECT "[{'a':9,'b':10, 'c':9}]" UNION ALL
  SELECT "[{ 'a':5,'b':10, 'c':9}, {'a':1,'b':10, 'c':9}, {'a':7,'b':10, 'c':9}]"
)
SELECT json, 
  (SELECT JSON_EXTRACT_SCALAR(x, '$.b')
  FROM UNNEST(json2array(JSON_EXTRACT(json, '$'))) x
  WHERE JSON_EXTRACT_SCALAR(x, '$.a') = '1'
  ) AS b
FROM `project.dataset.table`   

with result

Row json                                                                        b    
1   [{'a':1,'b':4, 'c':5}, {'a':0,'b':7, 'c':8},{'a':4,'b':9, 'c':12}]          4    
2   [{'a':9,'b':10, 'c':9}]                                                     null     
3   [{ 'a':5,'b':10, 'c':9}, {'a':1,'b':10, 'c':9}, {'a':7,'b':10, 'c':9}]      10  
Mikhail Berlyant
  • 165,386
  • 8
  • 154
  • 230
1

As of 1st May 2020, JSON_EXTRACT_ARRAY function has been added, and can be used to retrieve array from json.

#standardSQL
WITH `yourTable` AS (
  SELECT "[{'a':1,'b':4, 'c':5}, {'a':0,'b':7, 'c':8},{'a':4,'b':9, 'c':12}]" json_blob UNION ALL
  SELECT "[{'a':9,'b':10, 'c':9}]" UNION ALL
  SELECT "[{ 'a':5,'b':10, 'c':9}, {'a':1,'b':10, 'c':9}, {'a':7,'b':10, 'c':9}]"
)
  SELECT
    (select 
         json_extract_scalar(split_items,'$.b') as b 
            from unnest(json_extract_array(json_blob)) split_items 
           where json_extract_scalar(split_items,'$.a')='1'
    ) as b
  from yourTable

returns:

enter image description here

Pentium10
  • 204,586
  • 122
  • 423
  • 502