I am going through the Spark Programming Guide, which says:
Broadcast variables allow the programmer to keep a read-only variable cached on each machine rather than shipping a copy of it with tasks.
Considering the above, what are the use cases of broadcast variables? What problems do broadcast variables solve?
When we create a broadcast variable as shown below, is the variable reference (here, broadcastVar)
available on all the nodes in the cluster?
val broadcastVar = sc.broadcast(Array(1, 2, 3))
How long do these variables remain available in the memory of the nodes?
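
To make the question concrete, here is a minimal sketch of how I understand a broadcast variable would be used inside a task. It assumes a local SparkContext; the object name BroadcastSketch, the countryNames table, and the resolve helper are hypothetical names made up for illustration, not anything from the guide.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object BroadcastSketch {
  // Hypothetical read-only lookup table that every task needs (assumption for illustration).
  val countryNames: Map[String, String] = Map("DE" -> "Germany", "FR" -> "France")

  // Broadcasts the table, reads it inside the tasks via .value, and returns the results.
  def resolve(sc: SparkContext, codes: Seq[String]): Array[String] = {
    val lookup = sc.broadcast(countryNames)
    val resolved = sc.parallelize(codes)
      .map(code => lookup.value.getOrElse(code, "unknown")) // executor-side read
      .collect()
    // Asks Spark to drop the cached copies on the executors; the driver keeps
    // the value, so the broadcast could still be re-sent if used again.
    lookup.unpersist()
    resolved
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("BroadcastSketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try println(resolve(sc, Seq("DE", "FR", "XX")).mkString(", "))
    finally sc.stop()
  }
}
```

In this sketch the closure passed to map captures only the lookup handle and calls lookup.value inside the task, rather than capturing countryNames directly. The explicit unpersist() call is there because it seems related to the lifetime question: the Broadcast class also has a destroy() method, so I would like to understand what happens when neither is called.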