I am trying to run spark-submit through Airflow's SparkSubmitOperator, and I need to set SPARK_MAJOR_VERSION and HADOOP_USER_NAME before spark-submit runs. Can anyone help me with this?
I am running in YARN mode and have passed both variables through env_vars, but SPARK_MAJOR_VERSION is still not set, as the log below shows:
INFO - [2019-03-11 21:07:03,525] {base_hook.py:83} INFO - Using connection to: id: spark_default. Host: yarn://XXXX, Port: 8088, Schema: None, Login: peddnade, Password: XXXXXXXX, extra: {u'queue': u'priority', u'namespace': u'default', u'spark-home': u'/usr/'}
[2019-03-11 21:07:03,526] {logging_mixin.py:95} INFO - [2019-03-11 21:07:03,526] {spark_submit_hook.py:283} INFO - Spark-Submit cmd: [u'/usr/bin/spark-submit', '--master', 'yarn:/XX:8088', '--conf', 'spark.dynamicAllocation.enabled=true', '--conf', 'spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version=1', '--conf', 'spark.app.name=RDM', '--conf', 'spark.yarn.queue=priority', '--conf', 'spark.shuffle.service.enabled=true', '--conf', 'spark.yarn.appMasterEnv.SPARK_MAJOR_VERSION=2', '--conf', 'spark.yarn.appMasterEnv.HADOOP_USER_NAME=ppeddnade', '--files', '/usr/hdp/current/spark-client/conf/hive-site.xml', '--jars', '/usr/hdp/current/spark-client/lib/datanucleus-api-jdo-3.2.6.jar,/usr/hdp/current/spark-client/lib/datanucleus-rdbms-3.2.9.jar,/usr/hdp/current/spark-client/lib/datanucleus-core-3.2.10.jar', '--num-executors', '4', '--total-executor-cores', '4', '--executor-cores', '4', '--executor-memory', '5g', '--driver-memory', '10g', '--name', u'airflow-spark-example', '--class', 'com.hilton.eim.job.SubmitSparkJob', '--queue', u'priority', '/home/ppeddnade/XX.jar', u'XX']
[2019-03-11 21:07:03,542] {logging_mixin.py:95} INFO - [2019-03-11 21:07:03,542] {spark_submit_hook.py:415} INFO - Multiple versions of Spark are installed but SPARK_MAJOR_VERSION is not set
[2019-03-11 21:07:03,542] {logging_mixin.py:95} INFO - [2019-03-11 21:07:03,542] {spark_submit_hook.py:415} INFO - Spark1 will be picked by default
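Reading the logged command, the env_vars seem to have been turned into `--conf spark.yarn.appMasterEnv.*` settings. Those apply to the YARN application master, whereas the "SPARK_MAJOR_VERSION is not set" message appears to come from the local `/usr/bin/spark-submit` wrapper, which would only look at its own process environment. The sketch below illustrates that difference under this assumption. It is not Airflow's code: a child Python process stands in for the wrapper, and `run_wrapper` is a hypothetical helper.

```python
import os
import subprocess
import sys

# Stand-in for the local spark-submit wrapper (hypothetical): it only reports
# the SPARK_MAJOR_VERSION found in its own process environment and ignores
# any --conf arguments, which are meant for the YARN side.
WRAPPER = [
    sys.executable,
    "-c",
    "import os; print(os.environ.get('SPARK_MAJOR_VERSION', 'not set'))",
]


def run_wrapper(extra_args, extra_env=None):
    """Run the stand-in wrapper and return what it saw for SPARK_MAJOR_VERSION."""
    env = os.environ.copy()
    # Start from a clean state so the comparison does not depend on the shell.
    env.pop("SPARK_MAJOR_VERSION", None)
    if extra_env:
        env.update(extra_env)
    out = subprocess.check_output(WRAPPER + extra_args, env=env)
    return out.decode().strip()


# Case 1: the variable is passed only as a Spark conf, as in the logged command.
via_conf = run_wrapper(["--conf", "spark.yarn.appMasterEnv.SPARK_MAJOR_VERSION=2"])

# Case 2: the variable is placed in the launching process's environment.
via_env = run_wrapper([], {"SPARK_MAJOR_VERSION": "2"})

print(via_conf)
print(via_env)
```

If this reading is right, the variable would have to reach the environment of the process that launches spark-submit (for example, the Airflow worker's environment), not only the Spark conf. That is an assumption; the logs above do not confirm it.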