I am submitting an Apache Spark job using the spark-submit command, and I want to retrieve the application ID or job ID of the submitted job. What is the recommended way to do this?
3 Answers
The output of the spark-submit command can be parsed to get the application ID. This is the line you should be looking for:
2018-09-08 12:01:22 INFO StandaloneSchedulerBackend:54 - Connected to Spark cluster with app ID app-20180908120122-0001
appId=$(./bin/spark-submit <options> 2>&1 | tee /dev/tty | grep -i "Connected to Spark cluster" | grep -o "app-.*[0-9]")
echo $appId
app-20180908120122-0001
Your use case is not clear, but if you are looking for the application ID after the job has completed, this could be helpful. Note that this log line will differ for YARN and other cluster managers.
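If you would rather drive spark-submit from Scala than from the shell, a minimal sketch of the same output-parsing approach is below. The submit command and jar name are placeholders, and the YARN log line ("Submitted application application_...") is what YARN's client typically prints, so verify it against your cluster's output.

import scala.sys.process._

// Placeholder spark-submit invocation; substitute your own options.
val cmd = Seq("./bin/spark-submit", "--class", "com.example.Main", "app.jar")

// Spark logs to stderr by default, so capture stdout and stderr together.
val output = new StringBuilder
val exitCode = cmd ! ProcessLogger(line => output.append(line).append('\n'))

// Standalone masters log "Connected to Spark cluster with app ID app-...";
// YARN typically logs "Submitted application application_..." instead.
val appIdPattern = "(app-[0-9]+-[0-9]+|application_[0-9]+_[0-9]+)".r
val appId = appIdPattern.findFirstIn(output.toString)

println(appId.getOrElse(s"no app ID found (spark-submit exit code $exitCode)"))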

Ajay Srivastava
Thank you. What does "tee /dev/tty" do here? – BdEngineer Jun 05 '20 at 09:30
Since it's not clear whether you want it programmatically inside the app, I'll assume you do. You can get the YARN application ID, or the job ID in local mode, as follows:
import org.apache.spark.sql.SparkSession

val sparkSession: SparkSession = SparkSession.builder().getOrCreate()
val appId: String = sparkSession.sparkContext.applicationId
Hope this answers your question.
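If you're on the RDD API without a SparkSession, the same value is exposed on SparkContext directly. A minimal sketch (the app name here is just a placeholder):

import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("my-app"))  // placeholder app name
val appId: String = sc.applicationId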

Chitral Verma
You can also look up a running Structured Streaming query by its UUID or query name, like this: sparkSession.streams.get(uuid)
(where uuid is the query's id; to match on runId or name, filter sparkSession.streams.active instead)
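A minimal sketch of both lookups; the query name "my-query" and the queryId value are placeholders, not anything from this thread:

import java.util.UUID
import org.apache.spark.sql.SparkSession

val spark: SparkSession = SparkSession.builder().getOrCreate()
val queryId: UUID = ???  // the id you recorded when starting the query

// Direct lookup by query id (returns null if no active query has this id).
val byId = spark.streams.get(queryId)

// Scan the active queries to match on name or runId instead.
val byName = spark.streams.active.find(_.name == "my-query")
val byRunId = spark.streams.active.find(_.runId == queryId)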

David Buck

sahil