2

Can the GBQ team share more about why the "Query too large" error might be popped? And also more workarounds for the problem I experienced: In particular, I will give more details of what I was doing when it popped and some of the resolutions that never sufficed to appease the Query-too-large gods. I was doing a rather long comma join like

select fields from A_1, A_2, ..., A_15, where each had this many records:

1 - 41854
2 - 32287
3 - 16876
4 - 1799
5 - 3112
6 - 6412
7 - 6424
8 - 7286
9 - 14832
10 - 17167
11 - 51149
12 - 3895
13 - 8139
14 - 38395
15 - 22858

A_4 - A_8 were previously one query and needed to be broken up or the original result table would also result in the same error. Similarly for A_12 - A_15. (I did not optimize this partitioning for minimal number of A_i's, I just broke up the originals according to date partitions coming from the application.)

The queries producing the A_i's, i=1,...,15, are pared down in terms of fields and aggregation. I.e. I am only drawing necessary fields and I am aggregating as much as the application allows (considering even thoughtful, clever reductions). This still popped the error.

The next step was to aggregate away important information. This finally worked by reducing each of the A_i sizes, but at the expense of an important view into the data.

I understand that unioning tables might be the source of the problem (see Getting "Query too large" in BigQuery for example), if that is what table_range or table_date_range() is doing behind the scenes. I have only table_date_range()s over dates such that the table_date_range()s work and table_query()s. Does this mean a comma-join is doing something similar that has a similar limitation?

Insights? Why is this popped precisely? (Is the language in the error meaningful? Is the code for the query too long?) Are there fixes in the works? Thanks so much!

Community
  • 1
  • 1
JBarrera
  • 56
  • 4
  • 1
    If you want the BQ team to look at your particular situation, share the job id. In the meantime my reply from the post you saw remains - if you have too many tables, try unioning all that data in only one. (the number of records is irrelevant for this error) – Felipe Hoffa Oct 15 '15 at 06:29
  • @Felipe, I guess I do not follow. I thought the comma-join was essentially union a la http://stackoverflow.com/questions/10644993/support-union-function-in-bigquery-sql – JBarrera Oct 15 '15 at 16:38
  • Yes, comma-join is an union. 'Query too large' might mean that you had too many tables in that list. The problem here is not the number of tables, but how many fit in the max length for an expanded query (as a string). Can't tell much more without more details. – Felipe Hoffa Oct 15 '15 at 18:55

0 Answers0