I have built a Spark and a Flink k-means application. My test case is clustering 1 million points on a 3-node cluster.
When memory becomes the bottleneck, Flink starts to spill to disk and runs slowly, but it keeps working. Spark, however, loses executors when memory is full and restarts them (an infinite loop?).
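For illustration, here is roughly how the Spark side is set up (a simplified sketch, not my exact code; the input path, the parsing, and the k/iteration values are placeholders). I would have expected MEMORY_AND_DISK persistence to make Spark spill like Flink does:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.mllib.clustering.KMeans
    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.storage.StorageLevel

    object KMeansJob {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("kmeans"))

        // Parse the 1M points; MEMORY_AND_DISK should let partitions
        // spill to disk instead of being evicted when memory runs out.
        val points = sc.textFile("hdfs:///data/points.csv")
          .map(line => Vectors.dense(line.split(',').map(_.toDouble)))
          .persist(StorageLevel.MEMORY_AND_DISK)

        val model = KMeans.train(points, 10, 20)
        model.clusterCenters.foreach(println)
      }
    }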
I have tried to customize the memory settings with help from this mailing list, thanks. But Spark still does not work.
Are there any configurations that need to be set? I mean, Flink works with low memory, so Spark should be able to as well, shouldn't it?
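For example, are these the knobs I should be looking at (assuming Spark 1.6+ unified memory management; the values are just placeholders)?

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      .setAppName("kmeans")
      .set("spark.executor.memory", "4g")          // heap per executor
      .set("spark.memory.fraction", "0.6")         // share of heap for execution + storage
      .set("spark.memory.storageFraction", "0.5")  // share of the above protected from eviction

Or is there something else required so that Spark spills to disk instead of losing executors?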