1

Is it possible to iterate over multiple ga_sessions tables in BigQuery?

I'm not asking about how to query multiple ga_sessions tables. Which can be done in Standard SQL using: select * from 'a:b.c.ga_sessions_*' where _TABLE_SUFFIX between '20180501' and '20180505'.

I am asking about using a loop with date for example the answer in While loop over tables uses variables and declare cursor.

The reason is because you can only query a limited number of tables using the Querying Multiple Tables Using a Wildcard Table method.

I suspect it is only possible using a programming language like Python, but wanted to check.

enter image description here

CArnold
  • 465
  • 4
  • 7
  • 16

2 Answers2

2

You can use:

SELECT * 
FROM `bigquery-public-data.google_analytics_sample.ga_sessions_*`

Or

SELECT * 
FROM `bigquery-public-data.google_analytics_sample.*` 
WHERE _TABLE_SUFFIX IN (SELECT table_id 
  FROM `bigquery-public-data.google_analytics_sample.__TABLES_SUMMARY__`
  WHERE table_id 
  LIKE 'ga_session_%')

To go through all the tables within the dataset. The limitations that may apply are for the query job ones (1000 tables referenced per query) and the current wildcard table queries limitations which doesn't limit a number of tables.

F10
  • 2,843
  • 2
  • 12
  • 18
0

The reason is because you can only query a limited number of tables using the Querying Multiple Tables Using a Wildcard Table method.

What limit are you talking about? I do not see such a limitation

Anyway, you can UNION several query results with wildcards, e.g.

SELECT * 
FROM `a:b.c.ga_sessions_*`
WHERE _TABLE_SUFFIX between '20180501' and '20180531' # MAY
UNION ALL 
SELECT * 
FROM `a:b.c.ga_sessions_*`
WHERE _TABLE_SUFFIX between '20180601' and '20180630' # JUNE
# and more ...
Sergiiko
  • 319
  • 1
  • 10