
I started a Spark job in YARN cluster mode through spark-submit. To indicate partial failure etc., I want to pass an exit code from the driver to the script calling spark-submit.

I tried both System.exit and throwing SparkUserAppException in the driver, but in both cases the CLI only got 1, not the exit code I passed.
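Roughly what I tried in the driver, as a sketch (the failure check and the code 42 are placeholders):

    val partialFailure = true // placeholder for the real failure check

    // Attempt 1: exit the driver JVM with a custom code
    if (partialFailure) sys.exit(42)

    // Attempt 2: throw the exception that SparkSubmit maps to an exit code
    // (note: SparkUserAppException is private[spark] in some Spark versions)
    if (partialFailure) throw org.apache.spark.SparkUserAppException(42)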

I think it is impossible to pass a custom exit code, since any exit code passed by the driver will be converted to a YARN status, and YARN will convert any failing exit code to 1 or FAILED.

Zxcv Mnb
  • Could you tell me the command that you used to submit the job? – code Jan 02 '17 at 10:36
  • $SPARK_HOME/bin/spark-submit --verbose .... --master yarn --deploy-mode cluster ..... I use Spark 2.0.0 with Hadoop 2.3. Any particular option you are looking for? – Zxcv Mnb Jan 02 '17 at 11:31
  • I think --deploy-mode client would help you. Or at least, with some hack you should be able to achieve what you need. – code Jan 02 '17 at 11:40
  • I cannot use client mode. If it is not possible in YARN cluster mode, I would just like someone to confirm that. – Zxcv Mnb Jan 02 '17 at 12:14

2 Answers


By looking at the Spark code, I can conclude this:

It is possible in client mode. Look at the runMain() method of the SparkSubmit class.

Whereas in cluster mode, it is not possible to get the exit status of the driver, because your driver class runs inside the YARN ApplicationMaster on a cluster node rather than in the spark-submit process.

There is an alternative solution that might or might not suit your use case:

Host a REST API with an endpoint that receives status updates from your driver code. In the case of any exception, let your driver code use this endpoint to report the failure.
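For instance, a minimal sketch of the driver-side call, assuming a hypothetical status service at http://status-host:8080/jobs/myjob/status (the URL and status string are placeholders):

    import java.net.{HttpURLConnection, URL}
    import java.nio.charset.StandardCharsets

    // POST a status string (e.g. "PARTIAL_FAILURE:42") to the status endpoint
    def reportStatus(status: String): Unit = {
      val url = new URL("http://status-host:8080/jobs/myjob/status") // placeholder URL
      val conn = url.openConnection().asInstanceOf[HttpURLConnection]
      conn.setRequestMethod("POST")
      conn.setDoOutput(true)
      conn.setRequestProperty("Content-Type", "text/plain")
      val out = conn.getOutputStream
      try out.write(status.getBytes(StandardCharsets.UTF_8)) finally out.close()
      conn.getResponseCode // force the request to be sent
      conn.disconnect()
    }

After spark-submit returns, the calling script can query the same API to decide which exit code to use itself.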

code
  • spark-submit will still return the YARN exit code to the CLI, so we need a way for the driver to let the CLI know the exit code directly. There are a couple of ugly ways to do that. :P – Zxcv Mnb Jan 03 '17 at 10:22
  • Yes, there are. May I know why you cannot use client mode? – code Jan 03 '17 at 13:15
  • Cluster mode is recommended for production use. On clusters with non-uniform node configs, IMHO cluster mode ensures the driver runs on a node that has the required RAM and CPU. – Zxcv Mnb Jan 04 '17 at 08:08

You can save the exit code in an output file (on HDFS or the local FS) and make your script wait for this file to appear, then read it and proceed. This is definitely not an elegant way, but it may help you move forward. When saving the file, pay attention to the permissions on that location: your Spark process needs read/write access.
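A minimal sketch of the driver-side write, assuming an agreed-upon HDFS path such as /tmp/myjob/exitcode (the path is a placeholder):

    import java.nio.charset.StandardCharsets
    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, Path}

    // Write the exit code to a well-known HDFS file before the driver exits
    def writeExitCode(code: Int): Unit = {
      val fs = FileSystem.get(new Configuration())
      val out = fs.create(new Path("/tmp/myjob/exitcode"), true) // overwrite if present
      try out.write(code.toString.getBytes(StandardCharsets.UTF_8)) finally out.close()
    }

The calling script can then wait for the file, read it with something like hdfs dfs -cat /tmp/myjob/exitcode, and exit with that value.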

Oleg Hmelnits