
When broadcasting, Spark can fail with the error org.apache.spark.sql.errors.QueryExecutionErrors#notEnoughMemoryToBuildAndBroadcastTableError (Spark 3.2.1):


Why does BroadcastExchange need more driver memory? Isn't a broadcast just sending data to all the executors? Why is driver memory the bottleneck?

Thanks.

YFl
  • Yes, but the broadcast variable is sent from the driver to all workers, so it has to be serialized on the driver first. – Young Oct 20 '22 at 12:11
  • Even in a broadcast join? It makes little sense to me to bring the whole dataframe to the driver. – YFl Oct 20 '22 at 12:15
  • Only the driver communicates with all workers; a worker cannot send anything directly to another worker, imo. So anything shared with all workers goes through the driver. – Young Oct 20 '22 at 12:19
  • Unfortunately that's the case until executor side broadcast joins are implemented (see SPARK-17556) – Moritz Oct 20 '22 at 12:20
  • Ahh, that's so eye-opening! Thank you all. – YFl Oct 20 '22 at 12:21
  • Btw, I can recommend this talk on the topic https://www.youtube.com/watch?v=B9aY7KkTLTw – Moritz Oct 20 '22 at 12:23

1 Answer


Unfortunately, executor-side broadcast joins are not yet supported in Spark (see SPARK-17556). Currently, all data of the broadcast dataset is first collected to the driver, which builds an in-memory hash table and then distributes it to the workers. This can cause high memory pressure on the driver.
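The mechanism can be sketched in plain Python (no Spark required): the "driver" collects the small side of the join, builds a hash table, serializes it, and ships a copy to every "worker", which probes it against its partition of the large side. The driver-side dict plus its serialized bytes is exactly why driver memory, not executor memory, becomes the bottleneck. All names below are illustrative, not Spark APIs.

```python
import pickle

# Small dimension table: the side Spark would broadcast. (join_key, value)
small_table = [(1, "a"), (2, "b"), (3, "c")]

# Large fact table, already split into per-worker partitions. (value, join_key)
partitions = [
    [(10, 1), (11, 2)],   # worker 0's rows
    [(12, 3), (13, 4)],   # worker 1's rows; key 4 has no match
]

def driver_build_and_broadcast(rows):
    """Driver collects the small side, builds a hash table, serializes it.
    Both the dict and the pickled bytes live in driver memory -- with a
    not-so-small "small" table, this is where the driver runs out of memory."""
    hash_table = {key: value for key, value in rows}
    return pickle.dumps(hash_table)   # one copy is sent to every worker

def worker_probe(payload, partition):
    """Each worker deserializes its copy and probes it (inner join)."""
    hash_table = pickle.loads(payload)
    return [(value, key, hash_table[key])
            for value, key in partition if key in hash_table]

payload = driver_build_and_broadcast(small_table)
joined = [row for part in partitions for row in worker_probe(payload, part)]
print(joined)  # [(10, 1, 'a'), (11, 2, 'b'), (12, 3, 'c')]
```

Until SPARK-17556 lands, the practical workarounds are to give the driver more memory (`spark.driver.memory`) or to stop Spark from choosing the broadcast strategy, e.g. by lowering `spark.sql.autoBroadcastJoinThreshold` (or setting it to `-1` to disable automatic broadcasts).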

Moritz