
How do I stop Spark Streaming? My Spark Streaming job is running continuously, and I want to stop it in a graceful manner.

I have seen the below option to shut down the streaming application:

sparkConf.set("spark.streaming.stopGracefullyOnShutdown","true") 

Spark configuration: available properties
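For context, a minimal sketch (Scala; the app name and batch interval are placeholders I've assumed) of where this property gets set, i.e. on the SparkConf before the StreamingContext is created:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// The property has to be on the SparkConf before the context exists;
// "MyStreamingApp" and the 10-second batch interval are placeholders.
val sparkConf = new SparkConf()
  .setAppName("MyStreamingApp")
  .set("spark.streaming.stopGracefullyOnShutdown", "true")

val ssc = new StreamingContext(sparkConf, Seconds(10))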

But how do I update this parameter on a running application?

You cannot set the sparkConf of a SparkContext after its creation. – Knight71 Oct 12 '16 at 06:24
  • What do you mean when you say graceful? Is anything wrong happening when your app stops? – Amit Kumar Oct 12 '16 at 07:03
I want to stop the application manually. There are two scenarios: I am clear about how to stop it when some error happens (I have that in the code), but if I want to stop it manually, I am looking for a mechanism. – AKC Oct 12 '16 at 15:02
  • Does this answer your question? [How do I stop a spark streaming job?](https://stackoverflow.com/questions/32582730/how-do-i-stop-a-spark-streaming-job) – Tom Zych Apr 06 '20 at 10:12

1 Answer


Have a look at this blog post. It is the "nicest" way to gracefully terminate a streaming job I have come across.

How to pass the shutdown signal:

Now we know how to ensure a graceful shutdown in Spark Streaming. But how can we pass the shutdown signal to Spark Streaming? One naive option is to press CTRL+C at the terminal where the driver program runs, but obviously that's not a good option. One solution, which I am using, is to grep for the driver process of the Spark Streaming application and send it a SIGTERM signal. When the driver gets this signal, it initiates the graceful shutdown of the application. We can write the command as below in some shell script and run the script to pass the shutdown signal:

ps -ef | grep spark | grep <DriverProgramName> | awk '{print $2}' | xargs kill -SIGTERM

e.g. ps -ef | grep spark | grep DataPipelineStreamDriver | awk '{print $2}' | xargs kill -SIGTERM
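For completeness, a minimal driver-side sketch (Scala; the stream definitions are a placeholder, and I'm reusing the DataPipelineStreamDriver name from the example above) of the setup the blog post describes: set the property, then just call ssc.start() and ssc.awaitTermination(), and let the SIGTERM-triggered shutdown hook do the stopping:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object DataPipelineStreamDriver {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("DataPipelineStreamDriver")
      .set("spark.streaming.stopGracefullyOnShutdown", "true")

    val ssc = new StreamingContext(conf, Seconds(10))

    // ... define input DStreams and processing here (placeholder) ...

    ssc.start()
    // Blocks until the context is stopped; the SIGTERM sent by the
    // script above triggers the shutdown hook, and with the property
    // set, in-flight batches are allowed to finish before exit.
    ssc.awaitTermination()
  }
}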

Do I need to set sparkConf.set("spark.streaming.stopGracefullyOnShutdown","true") before I run the above command? – AKC Oct 12 '16 at 15:02
  • 1
    Yes, you need the setting as well :) But please read the full blogpost. – Glennie Helles Sindholt Oct 12 '16 at 18:43
Got it. When this is enabled, how do I gracefully shut down within my code if some exception occurs? – AKC Nov 01 '16 at 20:36
The blog post says: we can just set this parameter, and then call the methods ssc.start() and ssc.awaitTermination(). No need to call the ssc.stop method; otherwise the application might hang during shutdown. – AKC Nov 01 '16 at 20:37
If I can't use the stop method in code, how do I stop it gracefully in case of an exception? (see the sketch after these comments) – AKC Nov 01 '16 at 20:38
  • 1
    I tried it this way: ps -ef | grep spark | grep driver-20161101205113-0016 | awk '{print $2}' | xargs kill -SIGTERM and had the paramter true. When i run this command on linux terminal, i am receving. usage: kill [ -s signal | -p ] [ -a ] pid ... kill -l [ signal ] – AKC Nov 01 '16 at 20:59
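On the in-code question raised in the comments: StreamingContext.stop has an overload with a stopGracefully flag, so one option (a sketch under assumed names, not taken from the blog post) is to catch the failure around awaitTermination and stop gracefully yourself:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf()
  .setAppName("MyStreamingApp") // placeholder name
  .set("spark.streaming.stopGracefullyOnShutdown", "true")
val ssc = new StreamingContext(conf, Seconds(10))

// ... stream definitions (placeholder) ...

try {
  ssc.start()
  ssc.awaitTermination() // rethrows errors from the streaming computation
} catch {
  case e: Exception =>
    // Explicit graceful stop on the error path: wait for already
    // received data to be processed before shutting down.
    ssc.stop(stopSparkContext = true, stopGracefully = true)
    throw e
}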