I am trying to remove a manual step of passing an argument to my spark-submit by having my Java Spark application automatically calculate the number of available cores to use for partitioning. The hope was to find a way to do this programmatically.
I did look at this solution ([SO question] "Spark: get number of cluster cores programmatically"), but I'm not sure how to implement the "EncapsulationViolator" component in Java to make `blockManager.master.getStorageStatus.length - 1` work. I have also tried `sc.getExecutorStorageStatus.length - 1`, to no avail.
I was able to get the number of cores via `java.lang.Runtime.getRuntime().availableProcessors()`, but the number of nodes/workers/executors still eludes me.
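For reference, here is roughly what the core-count check looks like; as I understand it, `availableProcessors()` only reports the logical cores visible to the JVM it runs in (the driver), not the total across the cluster, which is why it isn't enough on its own:

```java
public class CoreCount {
    public static void main(String[] args) {
        // Reports the number of logical cores visible to *this* JVM
        // (i.e. the driver process), not the cluster total.
        int driverCores = Runtime.getRuntime().availableProcessors();
        System.out.println("Driver cores: " + driverCores);
    }
}
```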
Hoping someone has a suggestion for getting the number of executors beyond what has already been suggested. I'm on Spark 3.0 and writing in Java.