I am broadcasting an RDD with collectAsMap. The input RDD is around 5GB, and we apply some filters before collecting it to a map. But the broadcast fails after running for a long time. I even tried with 3GB of data and it still fails; however, with 100KB of data it succeeds. Are there any settings that I am missing? I have tried raising the upper limits on driver memory and executor memory. Note: this is a SparkContext broadcast, not Spark SQL.
val data = sc.broadcast(rdd.collect().toMap) // rdd is an RDD[(K, V)]
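For reference, here is a sketch of how the driver-side limits that typically bound this pattern can be configured. The app name, memory values, and the placeholder RDD are illustrative assumptions, not from the question; `spark.driver.maxResultSize` and `spark.driver.memory` are the standard Spark configuration keys involved:

```scala
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("broadcast-map") // hypothetical app name
  // collect() ships every partition's data to the driver; the total
  // serialized result must fit under spark.driver.maxResultSize
  // (default 1g), or the job is aborted.
  .set("spark.driver.maxResultSize", "8g")
  // The driver heap must hold the collected records plus the Map built
  // from them, so it needs headroom well beyond the raw data size.
  // Note: in client mode this setting only takes effect if supplied
  // before JVM startup (e.g. via spark-submit --driver-memory 16g).
  .set("spark.driver.memory", "16g")

val sc = new SparkContext(conf)

// Placeholder pair RDD; collect().toMap requires an RDD[(K, V)].
val rdd = sc.parallelize(Seq(("k1", 1), ("k2", 2)))
val data = sc.broadcast(rdd.collect().toMap)
```

Even with these raised, a multi-GB map is costly to broadcast; each setting change only moves the ceiling rather than removing the driver bottleneck.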