How do volatile variables behave when using multithreading inside Spark?
I have a multithreaded process that uses a volatile variable, `total`, to keep track of a sum across multiple threads. This variable and all of the methods being executed are static. How would this variable behave if I had multiple Spark workers executing separate instances of this process in parallel? Will each worker have its own `total` variable, or will it be shared across worker nodes?
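To make the setup concrete, here is a minimal sketch of the kind of static volatile sum described above (the class and field names are illustrative, not from the original program). It also shows why `volatile` alone is risky for a sum: `total += x` is a read-modify-write, which `volatile` does not make atomic, so an `AtomicLong` is shown alongside for comparison. Note that a `static` field is scoped to one JVM; since each Spark executor runs in its own JVM, each executor would see its own independent copy.

```java
import java.util.concurrent.atomic.AtomicLong;

public class VolatileSumDemo {
    // Shared by all threads inside THIS JVM only. Each Spark executor
    // is a separate JVM, so each executor gets its own copy of these.
    static volatile long volatileTotal = 0;
    static final AtomicLong atomicTotal = new AtomicLong();

    public static void main(String[] args) throws InterruptedException {
        Thread[] threads = new Thread[4];
        for (int t = 0; t < threads.length; t++) {
            threads[t] = new Thread(() -> {
                for (int i = 0; i < 100_000; i++) {
                    volatileTotal += 1;            // NOT atomic: concurrent updates can be lost
                    atomicTotal.incrementAndGet(); // atomic read-modify-write
                }
            });
            threads[t].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        // volatileTotal may come out below 400000 because of lost updates;
        // atomicTotal is always exactly 400000.
        System.out.println("volatile total: " + volatileTotal);
        System.out.println("atomic total:   " + atomicTotal.get());
    }
}
```

Within a single JVM, `volatile` guarantees visibility of writes across threads, but not atomicity of compound updates; that distinction matters regardless of how the work is distributed across Spark workers.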
EDIT: The reason I want to combine multithreading with Spark is that my program is a genetic algorithm that flows as follows: distribute `n` populations to Spark, ideally one population per worker. Each population has 10-100 "individuals." For each individual, calculate its fitness by running the multithreaded process 100 times (each iteration with a small parameter change) and return a function of the total across the iterations. The multithreaded process takes a long time, so I would like to speed it up in any way possible.
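For reference, the per-individual fitness step described above could be sketched without any shared mutable total at all, by collecting per-iteration results from futures and summing them afterward. Everything here is an assumption for illustration: `runOnce` is a hypothetical stand-in for the real per-iteration computation, and the parameter perturbation and fitness function are placeholders.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FitnessDemo {
    // Hypothetical per-iteration evaluation; stands in for the real
    // long-running computation in the question.
    static double runOnce(double param) {
        return param * param; // placeholder work
    }

    // Fitness of one individual: run 100 iterations, each with a small
    // parameter change, in parallel. Each task returns its own result;
    // the totals are combined only after all futures complete, so no
    // volatile or atomic shared counter is needed.
    static double fitness(double base, ExecutorService pool) throws Exception {
        List<Future<Double>> results = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            final double param = base + i * 0.01; // hypothetical perturbation
            results.add(pool.submit(() -> runOnce(param)));
        }
        double total = 0;
        for (Future<Double> f : results) {
            total += f.get();
        }
        return total; // "a function of the total" - here simply the sum
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        System.out.println("fitness(1.0) = " + fitness(1.0, pool));
        pool.shutdown();
    }
}
```

Because each task produces its own value instead of incrementing a shared static field, this pattern sidesteps the atomicity question entirely, and it behaves the same whether one copy runs locally or many copies run inside separate Spark executor JVMs.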