In which case would we like to have more than one executor in each worker?
Whenever possible: if a job's executors require fewer resources than a worker node has, then Spark should try to start other executors on the same worker to use all of its available resources.
But that's Spark's role, not our call. When deploying Spark apps, it is up to Spark to decide how many executors (JVM processes) are started on each worker node (machine). That depends on the executor resources (cores and memory) required by the Spark jobs (the spark.executor.* configs). We often don't know what resources are available per worker, since a cluster is usually shared by multiple apps and people. So we configure the number of executors and the resources each one requires, and let Spark decide whether to run them on the same worker or not.
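To make the "how many executors fit on one worker" reasoning concrete, here is a rough sketch of the arithmetic. It is not Spark's actual scheduling code: the function name and the example worker sizes are made up for illustration, and the overhead rule (10% of executor memory, at least 384 MiB) only mirrors the documented default of spark.executor.memoryOverhead on YARN and Kubernetes. Other cluster managers may account for memory differently.

```python
def executors_per_worker(worker_cores, worker_mem_mb, executor_cores, executor_mem_mb):
    """Rough upper bound on how many executors one worker can host.

    Assumption: each executor reserves its heap plus an overhead of
    max(384 MiB, 10% of the heap), as with the default memoryOverhead.
    """
    overhead_mb = max(384, executor_mem_mb // 10)
    by_cores = worker_cores // executor_cores
    by_memory = worker_mem_mb // (executor_mem_mb + overhead_mb)
    # The scarcer resource decides.
    return min(by_cores, by_memory)


# Hypothetical worker: 16 cores, 64 GiB of memory.
print(executors_per_worker(16, 65536, 4, 8192))   # limited by cores
print(executors_per_worker(16, 65536, 1, 28672))  # limited by memory
```

In practice the real numbers depend on what other apps are already running on the worker, which is exactly why we leave the placement to Spark.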
Now, maybe your question is "should we have fewer executors with lots of cores and memory, or distribute the work across several little executors?"
Having fewer but bigger executors clearly reduces shuffling. But there are several reasons to also prefer distribution:
- It is easier to start little executors
- Having a big executor means the cluster needs all of the required resources on one worker
- It is especially useful when using dynamic allocation, which starts and kills executors depending on runtime usage
- Several little executors improve resilience: if our code is unstable and might sometimes kill the executor, everything running on that executor is lost and restarted, so smaller executors mean less work is lost each time.
- I met a case where the code used in the executor wasn't thread-safe. That's a bad thing, but it wasn't done on purpose. So until this is fixed (or instead of fixing it :\ ), we distributed the work across many 1-core executors.
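As an illustration of the trade-off in the list above, here is a small sketch comparing two layouts with the same total capacity. The layout names, sizes, and helper functions are hypothetical; only the spark.executor.instances, spark.executor.cores, and spark.executor.memory keys are real Spark settings.

```python
# Two hypothetical layouts with the same total capacity (32 cores, 128g of heap).
FEW_BIG = {
    "spark.executor.instances": "4",
    "spark.executor.cores": "8",
    "spark.executor.memory": "32g",
}
MANY_SMALL = {
    "spark.executor.instances": "32",
    "spark.executor.cores": "1",  # one task at a time per JVM
    "spark.executor.memory": "4g",
}


def total_cores(conf):
    """Total task slots requested by a layout."""
    return int(conf["spark.executor.instances"]) * int(conf["spark.executor.cores"])


def lost_share_on_executor_failure(conf):
    """Fraction of the task slots lost when a single executor dies."""
    return 1 / int(conf["spark.executor.instances"])


def to_submit_args(conf):
    """Render the settings as spark-submit --conf arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


for name, conf in [("few big", FEW_BIG), ("many small", MANY_SMALL)]:
    print(name, total_cores(conf), lost_share_on_executor_failure(conf))
    print(" ".join(to_submit_args(conf)))
```

Both layouts ask for the same number of cores, but losing one executor costs a quarter of the slots in the first and about 3% in the second. The 1-core layout also runs a single task per JVM, which is how we worked around the non-thread-safe code.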