
I enabled spark.sql.thriftServer.incrementalCollect on my Thrift server (Spark 3.1.2) to prevent OutOfMemory exceptions. That works, but my queries are now very slow. The logs show that Thrift fetches the results in batches of 10,000 rows:

INFO SparkExecuteStatementOperation: Returning result set with 10000 rows from offsets [1260000, 1270000) with 169312d3-1dea-4069-94ba-ec73ac8bef80
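For context, this is how incrementalCollect is enabled in my setup (spark-defaults.conf excerpt):

# conf/spark-defaults.conf
spark.sql.thriftServer.incrementalCollect  true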

My hardware could easily handle 10x-50x that batch size. This issue and this documentation page suggest setting spark.sql.inMemoryColumnarStorage.batchSize, but that had no effect (see below).
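For completeness, this is how I tried that suggestion, restarting the Thrift server afterwards; the value is just an example:

$SPARK_HOME/sbin/start-thriftserver.sh \
  --conf spark.sql.inMemoryColumnarStorage.batchSize=100000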

Is it possible to configure this batch size?
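In case the batch size is negotiated by the client's fetch size rather than by a server setting (that is just a guess on my part), here is roughly how I read the results over JDBC; the URL, credentials, and table name are placeholders, and the Hive JDBC driver is assumed to be on the classpath:

import java.sql.DriverManager

object ThriftFetchTest {
  def main(args: Array[String]): Unit = {
    // Placeholder connection details; adjust for your Thrift server
    val conn = DriverManager.getConnection("jdbc:hive2://localhost:10000/default", "user", "")
    try {
      val stmt = conn.createStatement()
      // Hint a larger batch to the driver; whether the server honors this
      // under incrementalCollect is exactly what I am unsure about
      stmt.setFetchSize(100000)
      val rs = stmt.executeQuery("SELECT * FROM my_large_table")
      var n = 0L
      while (rs.next()) n += 1
      println(s"Fetched $n rows")
    } finally conn.close()
  }
}

If there is a server-side property that caps or overrides the client's fetch size, that would answer my question.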

fokoenecke

0 Answers