I am using Spark Streaming v2.0.0 to retrieve logs from Kafka and do some processing on them. I am using the mapWithState function
to save and update some fields related to a device. I am wondering how this function works in a cluster. So far I have only used standalone mode, but I will try it later on a YARN cluster.
However, let's say I have a cluster with several nodes. If a node updates the state of a device, does it immediately notify all the other nodes of this update? If not, does
mapWithState need to be configured to work in a cluster, and how can I do that?
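
For reference, a minimal sketch of the kind of per-device mapWithState usage described above. This is an illustration under assumptions, not the actual job: DeviceState, DeviceTracking, updateDevice and track are hypothetical names, and the input is assumed to be a DStream of (deviceId, timestamp) pairs already parsed from the Kafka logs.

```scala
import org.apache.spark.streaming.{State, StateSpec}
import org.apache.spark.streaming.dstream.DStream

// Hypothetical per-device state: latest timestamp seen and number of logs so far.
case class DeviceState(lastSeen: Long, logCount: Long)

object DeviceTracking {

  // Mapping function passed to mapWithState: invoked for each
  // (deviceId, timestamp) record, with the state currently stored for that key.
  def updateDevice(
      deviceId: String,
      timestamp: Option[Long],
      state: State[DeviceState]): (String, DeviceState) = {
    // Start from an empty state the first time a device is seen.
    val previous = state.getOption.getOrElse(DeviceState(0L, 0L))
    val updated = DeviceState(
      lastSeen = math.max(previous.lastSeen, timestamp.getOrElse(previous.lastSeen)),
      logCount = previous.logCount + (if (timestamp.isDefined) 1L else 0L))
    // Store the new state for this device key.
    state.update(updated)
    (deviceId, updated)
  }

  // Assumed input: (deviceId, timestamp) pairs extracted from the Kafka logs.
  def track(logs: DStream[(String, Long)]): DStream[(String, DeviceState)] =
    logs.mapWithState(StateSpec.function(updateDevice _))
}
```

Note that mapWithState requires checkpointing to be enabled on the StreamingContext (ssc.checkpoint with a directory) before the job starts.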