Some distributed computing engines, such as Spark or Flink, are able to ship code between machines and JVMs. For example, in Scala with Spark:
sc.parallelize(1 to 10).map(i => i+1).collect
Here, the function i => i+1 is serialized, sent to the workers, and executed on all of them. How is this done?
I'd also appreciate it if anyone could point me to the source code (classes) that implements this in an existing distributed-computing framework such as Spark or Flink.
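To make the question concrete, here is a minimal sketch of what I imagine happens, done inside a single JVM. It assumes plain Java serialization (`ObjectOutputStream` / `ObjectInputStream`); I don't know whether Spark or Flink actually do it this way, and the names `ClosureRoundTrip`, `serialize` and `deserialize` are made up for illustration.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

object ClosureRoundTrip {
  // Turn an object (here: a function value) into bytes using plain Java serialization.
  def serialize(obj: AnyRef): Array[Byte] = {
    val buffer = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buffer)
    try out.writeObject(obj) finally out.close()
    buffer.toByteArray
  }

  // Rebuild the object from bytes; the caller states the expected type.
  def deserialize[T](bytes: Array[Byte]): T = {
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes))
    try in.readObject().asInstanceOf[T] finally in.close()
  }

  def main(args: Array[String]): Unit = {
    val f: Int => Int = i => i + 1

    // "Driver" side: the bytes that would go over the network.
    val bytes = serialize(f)

    // "Worker" side: rebuild the function and apply it to the data.
    val g = deserialize[Int => Int](bytes)
    println((1 to 10).map(g))
  }
}
```

As far as I understand, these bytes hold only a reference to the function's class plus any captured values, not the bytecode itself, so the receiving JVM must already have the class available. How that part is handled across machines is exactly what I'm asking about.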