One of the way to go around NEST()
limitation of "nesting" just one field is to use BigQuery User-Defined Functions. They are extremely powerful yet still have some Limits and Limitations to be aware of. And most important from my prospective to have in mind - they are quite a candidates for being qualified as expensive High-Compute queries
Complex queries can consume extraordinarily large computing resources
relative to the number of bytes processed. Typically, such queries
contain a very large number of JOIN or CROSS JOIN clauses or complex
User-defined Functions.
So, below is example that "mimic" NEST(b, c) from example in questino:
SELECT a, d.b, d.c FROM
JS(( // input table
SELECT a, NEST(CONCAT(STRING(b), ',', STRING(c))) AS d
FROM (
SELECT * FROM
(SELECT 1 AS a, 5 AS b, 2 AS c),
(SELECT 1 AS a, 3 AS b, 1 AS c),
(SELECT 2 AS a, 2 AS b, 1 AS c)
) GROUP BY a),
a, d, // input columns
"[{'name': 'a', 'type': 'INTEGER'}, // output schema
{'name': 'd', 'type': 'RECORD',
'mode': 'REPEATED',
'fields': [
{'name': 'b', 'type': 'STRING'},
{'name': 'c', 'type': 'STRING'}
]
}
]",
"function(row, emit){ // function
var c = [];
for (var i = 0; i < row.d.length; i++) {
x = row.d[i].toString().split(',');
t = {b:x[0], c:x[1]}
c.push(t);
};
emit({a: row.a, d: c});
}"
)
It is relatively straightforward. I hope you will be able to walk through it and get an idea
Still - remember:
No matter how you create record with nested/repeated fields - BigQuery
automatically flattens query results, so visible results won't contain
repeated fields. So you should use it as a subselect that produces
intermediate results for immediate use by the same query.
As FYI, you can prove for yourself that above returns only two records (not three as it is looks like when it is flattened) by running below query
SELECT COUNT(1) AS rows FROM (
<above query here>
)
Another important NOTE:
This is a known that NEST()
is not compatible with UnFlatten Results
Output and mostly is used for intermediate result in subquery.
In contrast, above solution can be easily saved directly to table (with unchecked Flatten Results)