I want to list the tables which can be used as wildcards in BigQuery.
My dataset has the table list is similar to the following:
events_122022
events_122021
events_122020
...
...
events_112012
...
...
analytics_122022
analytics_122021
analytics_122020
...
...
analytics_112012
These tables are created dynamically and I have no information on the used table prefix Is there a way to find the list of tables which can be used dynamically?
The result should be:[events_, analytics_]
My attempt:
Find the tables with similar DDL using the following SQL
SELECT
SUBSTR(ddl, STRPOS(ddl, '(')) as commonDDL,
STRING_AGG(table_name) as table
FROM
dataset.INFORMATION_SCHEMA.TABLES
GROUP BY SUBSTR(ddl, STRPOS(ddl, '('))
This gives the output as :
commonDDL | table |
---|---|
(ID STRING, ...) | events_122022, events_122021, ... |
(NAME STRING, ...) | analytics_112022, analytics_112021 ... |
Now using a Longest common shared start algorithm I can find the required result. (Longest common start code here )
What are the other ways we can approach this problem?
Couldn't find anything on BigQuery docs.
Note: I only have readonly permission for the BigQuery dataset