I have broadcast a table that is less than 2 GB, but the explain plan still shows a sort-merge join on that table. Can auto-broadcast be turned off for some reason?
-
2 GB exceeds the broadcast upper limit of 10 MB, which is the default value; please check this post: https://stackoverflow.com/questions/41045917/what-is-the-maximum-size-for-a-broadcast-object-in-spark – abiratsis Sep 11 '20 at 08:47
-
If your cluster has huge resources (RAM and disk) and only one application runs at a time, then tune `spark.sql.autoBroadcastJoinThreshold` up toward 2 GB ... I personally wouldn't broadcast such huge data; it has a large overhead and decreases overall task performance. – kavetiraviteja Sep 12 '20 at 07:58
-
Thank you for your inputs! The issue was caused by an empty table. – Sumi Aug 06 '21 at 19:21
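
For anyone hitting the same symptom, here is a minimal PySpark sketch of the two knobs discussed above: raising `spark.sql.autoBroadcastJoinThreshold` (whose default is 10 MB) and forcing a broadcast with an explicit hint. The config key and the `broadcast()` function are real Spark APIs; the table and column names are hypothetical placeholders, and this assumes a running Spark environment.

```python
# Sketch: diagnosing a sort-merge join that was expected to be a broadcast join.
# Assumes a working Spark installation; "dim_table", "fact_table", and "id"
# are hypothetical names.
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = (
    SparkSession.builder
    # Default threshold is 10 MB (10485760 bytes); setting it to -1
    # disables auto-broadcast entirely, which produces exactly the
    # symptom described in the question.
    .config("spark.sql.autoBroadcastJoinThreshold", 100 * 1024 * 1024)
    .getOrCreate()
)

small = spark.table("dim_table")   # hypothetical small dimension table
large = spark.table("fact_table")  # hypothetical large fact table

# Explicit hint: broadcasts `small` regardless of Spark's size estimate.
joined = large.join(broadcast(small), "id")
joined.explain()  # check for BroadcastHashJoin in the physical plan
```

Note that auto-broadcast relies on Spark's size statistics for the table; if the statistics are missing or stale (as with the empty table that caused the issue here), Spark may not estimate the table as small enough and falls back to a sort-merge join. Running `ANALYZE TABLE <name> COMPUTE STATISTICS` refreshes those estimates.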