We are running our Spark application, written in Java, on the following hardware:
- one master node
- two worker nodes (each with 502.5 GB of available memory and 88 cores (CPUs))
with the following configuration passed to the ./spark-submit command:
--executor-memory=30GB --driver-memory=20G --executor-cores=5 --driver-cores=5
We are using the Spark standalone cluster manager.
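For reference, a back-of-the-envelope check of the executor count these settings allow, assuming the standalone scheduler places executors on workers by memory alone:
- executors per worker = floor(502.5 GB / 30 GB) = 16
- total executors = 16 x 2 workers = 32
- cores in use = 32 x 5 = 160 of the 176 available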
It takes 13 minutes to process 10 million records.
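For scale, 10 million records in 13 minutes (780 s) is roughly 12,800 records per second; spread evenly across 160 executor cores, that works out to about 12.5 ms per record.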
We are not at liberty to share the application code.
Can someone suggest configuration changes to tune the application for better performance?
Let me know if you need any other details.
We are using Spark 2.3.0.
EDIT
Our data contains 127 columns and 10 million rows. Spark started 32 executors with the above configuration. We are making an external application call inside a flatMap function, roughly as sketched below.
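To illustrate the pattern (a hypothetical sketch only: ExternalApp.callExternal is a placeholder for our external call, input stands for an existing JavaRDD<String>, and the real code differs):

    import java.util.List;
    import org.apache.spark.api.java.JavaRDD;

    // Each input record triggers a blocking call to an external application
    // and may expand to zero or more output records.
    JavaRDD<String> output = input.flatMap((String record) -> {
        List<String> results = ExternalApp.callExternal(record); // hypothetical helper
        return results.iterator();
    });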
Do you think the hardware resources are insufficient?