I am trying to write a BigQuery SQL function / stored procedure / table function that accepts as input:
- a
INT64
filter for theWHERE
clause, - a table name (
STRING
type) as fully qualified name e.g.project_id.dataset_name.table_name
The idea is to dynamically figure out the table name and provide a filter to slice the data to return as a table.
However if try to write a Table Function (TVF) and I use SET
to start dynamically writing the SQL to execute, then I see this error:
Syntax error: Expected "(" or keyword SELECT or keyword WITH but got keyword SET at [4:5]
If I try to write a stored procedure, then it expects BEGIN
and END
and throws this error:
Syntax error: Expected keyword BEGIN or keyword LANGUAGE but got keyword AS at [3:1]
If I try to add those, then I get various validation errors basically because I need to remove the WITH
using CTEs (Common Table Expression), and semicolons ;
etc.
But what I am really trying to do is using a table function:
- to combine some CTEs dynamically with those inputs above (e.g. the input table name),
- to
PIVOT
that data, - to then eventually return a table as a result of a
SELECT
.
A bit like producing a View that could be used in other SQL queries, but without creating the view (because the slice of data can be decided dynamically with the other INT64
input filter).
Once I dynamically build the SQL string I would like to EXECUTE IMMEDIATE
that SQL and provide a SELECT
as a final step of the table function to return the "dynamic table".
The thing is that:
- I don't know before runtime the name of this table.
- But I have all these tables with the same structure, so the SQL should apply to all of them.
Is this possible at all?
This is the not-so-working SQL I am trying to work around. See what I am trying to inject with %s
and num_days
:
CREATE OR REPLACE TABLE FUNCTION `my_dataset.my_table_func_name`(num_days INT64, fqn_org_table STRING)
AS (
-- this SET breaks !!!
SET f_query = """
WITH report_cst_t AS (
SELECT
DATE(start) as day,
entity_id,
conn_sub_type,
FROM `%s` AS oa
CROSS JOIN UNNEST(oa.connection_sub_type) AS conn_sub_type
WHERE
DATE(start) > DATE_SUB(CURRENT_DATE(), INTERVAL num_days DAY)
AND oa.entity_id IN ('my-very-long-id')
ORDER BY 1, 2 ASC
),
cst AS (
SELECT * FROM
(SELECT day, entity_id, report_cst_t FROM report_cst_t)
PIVOT (COUNT(*) AS connection_sub_type FOR report_cst_t.conn_sub_type IN ('cat1', 'cat2','cat3' ))
)
""";
-- here I would like to EXECUTE IMMEDIATE !!!
SELECT
cst.day,
cst.entity_id,
cst.connection_sub_type_cat1 AS cst_cat1,
cst.connection_sub_type_cat2 AS cst_cat2,
cst.connection_sub_type_cat3 AS cst_cat3,
FROM cst
ORDER BY 1, 2 ASC
);