Is there any formula to calculate the number of executors for a Spark job based on the input file size? Or can we launch the number of executors based on the number of HDFS blocks in the data? And the second question: can we launch two executors on the same node for the same Spark job?
1 Answer
Actually, the number of executors is not related to the number or size of the files your job reads. The number of executors is determined by the amount of resources, such as cores and memory, available on each worker. There are some rules of thumb that you can read more about at the first link, second link and third link.
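For illustration, here is a minimal sketch of the common sizing arithmetic those links describe. All the numbers are hypothetical assumptions, not values from this answer: a cluster of 6 workers, each with 16 cores and 64 GB of RAM.

```scala
import org.apache.spark.sql.SparkSession

// Assumed cluster: 6 workers, 16 cores and 64 GB RAM each.
// Common rule of thumb:
//   - leave ~1 core and ~1 GB per node for the OS and Hadoop daemons,
//   - use ~5 cores per executor for good HDFS I/O throughput.
// Usable cores per node  = 16 - 1 = 15
// Executors per node     = 15 / 5 = 3
// Total executors        = 6 * 3 = 18, minus 1 for the driver/AM => 17
// Memory per executor    = 63 GB / 3 = 21 GB, minus ~7% overhead => ~19 GB
val spark = SparkSession.builder()
  .appName("executor-sizing-sketch")
  .config("spark.executor.instances", "17")
  .config("spark.executor.cores", "5")
  .config("spark.executor.memory", "19g")
  .getOrCreate()
```

Note how the file size never enters the calculation; only the per-node resources do.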
But as a piece of advice: you will usually get much better performance if you run more than one executor on each worker. To see why, consider the following.
There is a communication link between executors on worker nodes; in other words, different worker nodes interact with each other while your job runs on the cluster. So if you have more than one executor on a worker node, you reduce the network overhead for this type of communication. Moreover, you get much better resource utilization. If you follow the guidance in the links above about the number of executors and tune your configuration accordingly, your Spark job should run with high performance. A sketch of this setup follows.
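As a concrete illustration of the second question, here is a minimal sketch assuming a standalone cluster whose workers each expose 8 cores. Capping spark.executor.cores below the worker's core count lets the scheduler place two executors of the same job on one node. The master URL and all numbers here are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

// Assumed standalone cluster: each worker started with SPARK_WORKER_CORES=8.
// By default a standalone worker gives all its cores to a single executor;
// capping spark.executor.cores at 4 allows two 4-core executors of the
// same application to run side by side on one node.
val spark = SparkSession.builder()
  .appName("two-executors-per-node-sketch")
  .master("spark://master-host:7077")   // hypothetical master URL
  .config("spark.executor.cores", "4")  // each executor uses 4 of the 8 cores
  .config("spark.executor.memory", "8g")
  .config("spark.cores.max", "16")      // total cores this job may claim
  .getOrCreate()
```

With this configuration, tasks co-located on the same node exchange data locally instead of over the network, which is the overhead reduction described above.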
