3

I am currently working with a large table (~105M records) in a C# application.

  1. When I query the table with an 'Order by' or 'Order Each by' clause, I get a "Resources exceeded during query execution" error.

  2. If I remove the 'Order by' or 'Order Each by' clause, I get a "Response too large to return" error.

Here are the sample queries for the two scenarios (I am using the public Wikipedia table):

  1. `SELECT Id, Title, COUNT(*) FROM [publicdata:samples.wikipedia] GROUP EACH BY Id, Title ORDER BY Id, Title DESC`

  2. `SELECT Id, Title, COUNT(*) FROM [publicdata:samples.wikipedia] GROUP EACH BY Id, Title`

Here are the questions I have:

  1. What is the maximum size of a BigQuery response?
  2. How do we select all the records in a query request rather than via the 'Export' method?
selva kumar

1 Answer

3

1. What is the maximum size of a BigQuery response?

As mentioned in the Quota policy, the maximum query response size is 10 GB compressed (unlimited when returning large query results).

2. How do we select all the records in a query request rather than via the 'Export' method?

If you plan to run a query that might return larger results, you can set allowLargeResults to true in your job configuration.

Queries that return large results will take longer to execute, even if the result set is small, and are subject to additional limitations:

  • You must specify a destination table.
  • You can't specify a top-level ORDER BY, TOP or LIMIT clause. Doing so negates the benefit of using allowLargeResults, because the query output can no longer be computed in parallel.
  • Window functions can return large query results only if used in conjunction with a PARTITION BY clause.

Read more about how to paginate through the results here, and also see the BigQuery Analytics book, starting around page 200, where it is explained how Jobs.getQueryResults works together with the maxResults parameter and its blocking mode.
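The pagination pattern itself is language-agnostic: request up to `maxResults` rows, then keep re-calling getQueryResults with the `pageToken` from the previous response until no token comes back. A minimal Python sketch of that loop, where `fetch_page` is a hypothetical stand-in for the actual getQueryResults call:

```python
def fetch_all_rows(fetch_page, max_results=1000):
    """Collect every row of a result set by following page tokens.

    fetch_page(page_token, max_results) must return a (rows, next_token)
    tuple, mimicking a getQueryResults response; next_token is None on
    the last page.
    """
    rows = []
    page_token = None
    while True:
        page, page_token = fetch_page(page_token, max_results)
        rows.extend(page)
        if page_token is None:  # no pageToken in the response: last page
            return rows


# Usage with a fake three-page result set standing in for the API:
pages = {None: ([1, 2], "t1"), "t1": ([3, 4], "t2"), "t2": ([5], None)}

def fake_fetch(token, max_results):
    return pages[token]

print(fetch_all_rows(fake_fetch))  # [1, 2, 3, 4, 5]
```

For the 19M-row table discussed in the comments below, this is exactly the "1000 records per call" loop: the token chain replaces any attempt to pull the whole result in one response.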

Update:

Query Result Size Limitations

When you run a normal query in BigQuery, the response size is limited to 10 GB of compressed data. Sometimes, it is hard to know what 10 GB of compressed data means. Does it get compressed 2x? 10x? The results are compressed within their respective columns, which means the compression ratio tends to be very good. For example, if you have one column that is the name of a country, there will likely be only a few different values. When you have only a few distinct values, this means that there isn’t a lot of unique information, and the column will generally compress well. If you return encrypted blobs of data, they will likely not compress well because they will be mostly random. (This is explained on the book linked above on page 220)
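The effect is easy to reproduce outside BigQuery. The sketch below uses zlib purely as an illustration (BigQuery's internal codec is not specified here): a low-cardinality "country" column compresses dramatically, while random bytes of the same size barely compress at all:

```python
import random
import zlib

# A "country" column: one million values drawn from only four distinct names.
countries = random.choices([b"France", b"Japan", b"Brazil", b"Kenya"], k=1_000_000)
column = b"\n".join(countries)

# An "encrypted blob" column of the same size: effectively random bytes.
blob = random.randbytes(len(column))

# Compression ratio = original size / compressed size.
ratio_column = len(column) / len(zlib.compress(column))
ratio_blob = len(blob) / len(zlib.compress(blob))

print(f"low-cardinality column: {ratio_column:.1f}x")
print(f"random blob:            {ratio_blob:.2f}x")
```

The repetitive column typically compresses by well over an order of magnitude, while the random blob stays near 1x, which is why a 10 GB compressed limit can mean very different uncompressed sizes depending on your schema.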

Pentium10
  • 1. Maximum response size: 128 MB compressed. What does 'compressed' mean here? We are not using any compression logic in our SELECT statements. – selva kumar Jul 08 '15 at 06:43
  • 2. This query **SELECT Id,Title,Count(*) FROM [publicdata:samples.wikipedia] Group EACH by Id, title** returns 19,376,810 (19M) records (calculated using COUNT). It returns a "Response too large to return" error in the Web UI as well as in the .NET client API. You mentioned setting the **allowLargeResults** property, but it can only be set when we specify a destination table, which basically means copying the query results into another BigQuery table. I am able to do that step, and the new BigQuery table then has the 19,376,810 (19M) records, but how do we select those large results from the new table using an API call? – selva kumar Jul 08 '15 at 07:10
  • @selvakumar As explained in the linked article, you paginate, e.g. you read 1000 records in each call, hence you need to do ~19k calls to get all the data. A bad use case anyway. – Pentium10 Jul 08 '15 at 07:12
  • Now I have set MaxResults=1000 in my QueryRequest and execute the query with **QueryResponse response = j.Query(queryRequest, projectID).Execute();**. When it starts querying (**Select * From DataSet.TableID**) it again throws the **Response too large to return** error. If I apply LIMIT 100000, the query runs and I am able to get the results in 100 calls. Why does Google BigQuery throw an error for large results instead of returning 1000 records at a time? – selva kumar Jul 08 '15 at 07:40
  • 1
    You need to set `allowLargeResults` on the job configuration level, and you need to set a destination table. That's what you miss. – Pentium10 Jul 08 '15 at 07:57
  • Yes, I understand the flow now: I need to execute the job with my query and set **AllowLargeResults=true**, then get the response using the **GetQueryResultsRequest** method (paginating the results using **PageToken**). – selva kumar Jul 08 '15 at 09:57
  • The query maximum response size was increased from 128 MB to 10 GB https://cloud.google.com/bigquery/quotas#queries – MonicaPC Aug 01 '19 at 02:43