3

In spark job. I am using if file not found the system.exit(0). It should gracefully complete the job. Locally It is successfully completed. But when I am running on EMR. Step is failing.

manisha
  • 31
  • 3

1 Answers1

7

EMR uses YARN for cluster management and launching Spark applications. So when you're running a Spark app with --deploy-mode: cluster in EMR, the Spark application code is not running in a JVM on its own, but is rather executed by the ApplicationMaster class.

Browsing through the ApplicationMaster code can explain what happens when you try to execute System.exit(). The user application is launched in startUserApplication, and then the finish method is called after the user application returns. However, when you call System.exit(0), what is executed instead is a shutdown hook which sees that your code hasn't finished successfully, and marks it as an EXIT_EARLY failure. It also has this useful comment:

  // The default state of ApplicationMaster is failed if it is invoked by shut down hook.
  // This behavior is different compared to 1.x version.
  // If user application is exited ahead of time by calling System.exit(N), here mark
  // this application as failed with EXIT_EARLY. For a good shutdown, user shouldn't call
  // System.exit(0) to terminate the application.
Ivan Vergiliev
  • 3,771
  • 24
  • 23
  • is their a proper way to trigger a job complete ? other than just waiting for the `main` to return nothing ? – Wonay Mar 19 '20 at 21:24
  • 2
    You'd probably need to find a way to return from the main method when you want to complete the job. For example, a somewhat hacky way could be to throw an exception from the completion point you want, catch that exception in `main`, and then exit gracefully there. There might be better ways depending on how your code is structured. – Ivan Vergiliev Mar 20 '20 at 15:11
  • Thank you Ivan. Is there a way to "catch" a system.exit ? If a library is calling it and I need to bypass it. – Wonay Mar 21 '20 at 21:10
  • 1
    Not that I know of. The only approach I’m aware of is by using a SecurityManager - however, this is what YARN already does, and it seems like it’s not possible to have multiple security managers registered in the same Java application. There’s a general discussion in https://stackoverflow.com/questions/5401281/preventing-system-exit-from-api , but that seems to be the only approach mentioned there too. – Ivan Vergiliev Mar 23 '20 at 22:37