When I run a Spark job written with PySpark, a JVM is spawned with an -Xmx1g
setting that I cannot seem to override. Here is the ps aux
output:
/usr/lib/jvm/jre/bin/java -cp /home/ec2-user/miniconda3/lib/python3.6/site-packages/pyspark/conf:/home/****/miniconda3/lib/python3.6/site-packages/pyspark/jars/* -Xmx1g org.apache.spark.deploy.SparkSubmit pyspark-shell
My question is: how do I set this property? I can set the master's memory with SPARK_DAEMON_MEMORY
and SPARK_DRIVER_MEMORY,
but neither affects the process PySpark spawns.
I already tried JAVA_OPTS,
and also looked through the package's /bin
files, but couldn't work out where this value is set.
Setting spark.driver.memory
and spark.executor.memory
in the job's own configuration didn't help either.
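
For context, this is roughly what the programmatic configuration looked like (a minimal sketch with placeholder values; the app name and sizes are not from the actual job). The catch is that spark.driver.memory only takes effect if it is known before the driver JVM is launched, which is why the already-spawned pyspark-shell process keeps its -Xmx1g default:

from pyspark import SparkConf, SparkContext

# Sketch only: placeholder app name and memory sizes.
conf = (
    SparkConf()
    .setAppName("memory-demo")            # hypothetical name
    .set("spark.driver.memory", "4g")     # ignored here: the driver JVM is already running with -Xmx1g
    .set("spark.executor.memory", "4g")   # executors launched later can still pick this up
)
sc = SparkContext(conf=conf)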
Edit:
After moving to submitting jobs with spark-submit (the code and infrastructure evolved from a standalone configuration), everything was resolved. Submitting programmatically (using SparkConf
) seems to override some of the cluster's setup.
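
For anyone hitting the same issue: passing the memory settings on the command line means they are known before the driver JVM starts, along the lines of (script name and sizes are placeholders):

spark-submit --driver-memory 4g --executor-memory 4g my_job.py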