I have a (JSON) String column like this:
[{"someValue":6,"other":true, "maybeThisValue":"exists?"},{"someValue": "cool value", "other": null}]
I'd like to end up with this:
+------------+-------+----------------+
| someValue | other | maybeThisValue |
+------------+-------+----------------+
| 6 | true | exists? |
+------------+-------+----------------+
| cool value | | |
+------------+-------+----------------+
because the schema changes so much I would like to get the keys of the string and dynamically generate a record from it (no arrays or structs). The input JSON will always be an array of objects with single depth key-value (no childNodes).
I'm thinking maybe this isn't possible with one UDF because we have to specify the return value in advance and there could be additional columns. But maybe this is possible with two UDF? Can one UDF generate and save then execute another UDF?
Or maybe I can start with a UDF which will give me the unique keys of the (JSON) strings and then write a dbt macro which will use the return of that function call?? (I'm not sure this is possible)
I know how to do this in postgres but I'm wondering if you have any ideas to get it working in BigQuery or if it can be done in UDF that would be really cool even if it's a little slow
Converting data types is a pain but if it could detect the first 5 columns are all numbers and safe_cast to int that would be really really cool... I'm not sure how to do that though so string is fine
similar to this:
https://dba.stackexchange.com/questions/151838/postgresql-json-column-to-view
https://github.com/fishtown-analytics/dbt-utils/blob/master/macros/sql/get_column_values.sql
maybe it is possible with custom JSONPath '*.*'
: What JsonPath expressions are supported in BigQuery?
maybe it's not possible without array<struct>
:/
maybe this is possible with BigQuery scripting?? https://cloud.google.com/bigquery/docs/reference/standard-sql/scripting
maybe I can use this for part of it How to extract all the keys in a JSON object with BigQuery
I can get the values like this...
CREATE TEMP FUNCTION json2array(json STRING)
RETURNS ARRAY<STRING>
LANGUAGE js AS """
return JSON.parse(json).map(x=>JSON.stringify(x));
""";
CREATE TEMPORARY FUNCTION CUSTOM_JSON_EXTRACT(json STRING, json_path STRING)
RETURNS ARRAY<STRING>
LANGUAGE js AS """
return jsonPath(JSON.parse(json), json_path);
"""
OPTIONS (
library="gs://your_bucket/jsonpath-0.8.0.js"
);
select SPLIT(CUSTOM_JSON_EXTRACT(jsoncolumn,'*.'))
FROM `db.master`,
unnest(json2arrayUDF(JSON_EXTRACT(jsoncolumn, "$"))) jsoncolumn
it looks like:
|1 | 6 |
| | true |
| | exists? |
|2 | cool value |
| | |
| | |
now I just need to get the keys and then somehow match up the two things...