Can the BigQuery (GBQ) team share more about why the "Query too large" error is raised, and suggest more workarounds for the problem I ran into? Below are the details of what I was doing when the error appeared, along with the attempted fixes that never sufficed to appease the Query-too-large gods. I was running a rather long comma join of the form
select fields from A_1, A_2, ..., A_15
where each A_i had this many records:
1 - 41854
2 - 32287
3 - 16876
4 - 1799
5 - 3112
6 - 6412
7 - 6424
8 - 7286
9 - 14832
10 - 17167
11 - 51149
12 - 3895
13 - 8139
14 - 38395
15 - 22858
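For a sense of scale, the record counts above can be totaled with a short script (Python, purely for illustration; the variable names are mine and not part of any query):

```python
# Record counts for A_1 ... A_15, copied from the list above.
record_counts = [
    41854, 32287, 16876, 1799, 3112,
    6412, 6424, 7286, 14832, 17167,
    51149, 3895, 8139, 38395, 22858,
]

total = sum(record_counts)
largest = max(record_counts)

print(total)    # 272485
print(largest)  # 51149
```

So the fifteen inputs add up to roughly 272 thousand records, with the largest single input at about 51 thousand.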
A_4 through A_8 were originally a single query, which had to be broken up because on its own it produced the same error. The same goes for A_12 through A_15. (I did not try to minimize the number of A_i's; I simply split the originals along the date partitions coming from the application.)
The queries producing the A_i's, i=1,...,15, are already pared down in terms of fields and aggregation. That is, I select only the necessary fields and aggregate as much as the application allows (even after considering some thoughtful, clever reductions). The error was still raised.
The next step was to aggregate away important information. This finally worked, because it reduced the size of each A_i, but at the expense of an important view into the data.
I understand that unioning tables might be the source of the problem (see Getting "Query too large" in BigQuery, for example), if that is what table_range or table_date_range() does behind the scenes. My queries use only table_query()s and table_date_range()s, the latter over date ranges for which each table_date_range() succeeds on its own. Does this mean that a comma join does something similar and is subject to a similar limitation?
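To make that guess concrete, here is a minimal Python sketch of what I imagine might be happening. It assumes (this is not confirmed behavior) that table_date_range() is expanded into one table reference per day before the query runs, so that the effective query text grows with the number of days. The table prefix mydataset.events_, the field names, and the date range are all hypothetical.

```python
from datetime import date, timedelta


def expand_date_range(prefix, start, end):
    """Return hypothetical daily table names prefix + YYYYMMDD, start..end inclusive."""
    days = (end - start).days + 1
    return [f"{prefix}{(start + timedelta(days=i)):%Y%m%d}" for i in range(days)]


def build_comma_query(fields, tables):
    """Build a query whose FROM clause lists every table, separated by commas."""
    return f"SELECT {fields} FROM " + ", ".join(f"[{t}]" for t in tables)


# ASSUMPTION: one table per day; names and fields are made up for illustration.
tables = expand_date_range("mydataset.events_", date(2015, 1, 1), date(2015, 12, 31))
query = build_comma_query("user_id, event, ts", tables)

# Size of the expanded query text in bytes.
query_bytes = len(query.encode("utf-8"))
print(len(tables), "tables,", query_bytes, "bytes of query text")
```

If something like this expansion is applied to every A_i, the text of the full comma join could be far longer than the query I actually typed, which is why I am asking whether the error refers to the expanded text.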
Any insights? What precisely triggers this error? (Is the wording of the error meaningful, i.e. is the text of the query itself too long?) Are there fixes in the works? Thanks so much!