I'd like to understand the internals of Spark's FAIR scheduling mode. The thing is that it seems not so fair as one would expect according to the official Spark documentation:
Starting in Spark 0.8, it is also possible to configure fair sharing between jobs. Under fair sharing, Spark assigns tasks between jobs in a “round robin” fashion, so that all jobs get a roughly equal share of cluster resources. This means that short jobs submitted while a long job is running can start receiving resources right away and still get good response times, without waiting for the long job to finish. This mode is best for multi-user settings.
It seems like jobs are not handled equally and actually managed in fifo order.
To give more information on the topic:
I am using Spark on YARN. I use the Java API of Spark. To enable the fair mode, The code is :
SparkConf conf = new SparkConf();
conf.set("spark.scheduler.mode", "FAIR");
conf.setMaster("yarn-client").setAppName("MySparkApp");
JavaSparkContext sc = new JavaSparkContext(conf);
Did I miss something?