I am looking for a custom function that can accept a STRUCT
and return some string output of it. For this example, let us just assume the types of the struct may be a STRING
, INT64
, or a DATE
. So we have something like this:
STRUCT(1 AS num, "hi" as str, DATE "2014-01-01" as date)
And in the output I would like to produce the following JSON obj, which is a list of triples -- [key, value, type]
, with the value
being the json-encoded value.
[
["num", 1, "INT64"],
["str", "hi", "STRING"],
["date", "2014-01-01", "DATE"]
]
How could I do this in BigQuery? So far I have some scaffolding down, but not quite sure how to get the "inspect" for the type or key of a value.
CREATE TEMP FUNCTION structFunc(obj ANY TYPE)
RETURNS json
AS (
JSON "[]"
);
SELECT
structFunc(STRUCT(1 as num,"x" as str, date "2014-01-01" as date))
I'm hoping there is something like a FOR
loop I can iterate over the k-v pairs.
Note, here is a related question -- How to extract all the keys in a JSON object with BigQuery, that shows how to get an arbitrary list of k,v pairs in a js
udf. However, it has the following two limitations:
- JS requires an input type. It cannot allow
ANY TYPE
, and you do you not know the struct beforehand. - If you convert to json you lose non-json types unless you preserve additional information someplace else. For example
DATE "2014-01-01"
and"2014-01-01"
would both be encoded the same in JSON even if they are two different types.
Additionally, if this is an easier way to solve this. I am fine having the STRUCT
as part of a stored table if it inspecting the INFORMATION_SCHEMA
makes this simpler to solve. For example:
CREATE TABLE `my-dataset.123`.tbl AS (
SELECT STRUCT(1 as num,"x" as str, date "2014-01-01" as date) as val
)
In this way perhaps we could create a js udf that has a signature something like this:
CREATE TEMP FUNCTION structFunc(obj ANY TYPE)
RETURNS json
LANGUAGE js
AS r"""
function toTriples(
obj_as_json:JSON,
metadata_from_information_schema:JSON
) {...}
""";
And here are two additional helpers to get the value as well as the info :
-- CREATE TABLE `my-dataset.123`.tbl AS (SELECT STRUCT(1 as num,"x" as str, date "2014-01-01" as date) as val);
-- Get the value
select to_json(val) from `my-dataset.123`.tbl
-- Get the info (array<field_path STRING, data_type STRING>)
select array(SELECT as struct
field_path, data_type
FROM
`my-dataset.123`.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS
WHERE
table_name = 'tbl'
AND column_name = 'val')