I'm having a problem monitoring a program using monit.

I'm running this on a Raspberry Pi, having built monit 5.11 from source. I tried the version from the repositories, but it was 5.4 and didn't support some of the syntax below that I want.

I'm trying to follow the "Q: I have a program that does not create its own pid file. Since monit requires all programs to have a pid file, what do I do?" entry in the FAQ.

Here's my start_sensors.sh script (which just runs my python program, instead of the java program in the wiki example):

#!/bin/bash

case $1 in
  start)
     echo $$ > /var/run/start_sensors.pid;
     exec 2>&1 /usr/bin/python /home/pi/temperature/post_temps.py 1>/tmp/post_temps.out
     ;;
   stop)
     kill `cat /var/run/start_sensors.pid` ;;
   *)
     echo "usage: start_sensors {start|stop}" ;;
esac
exit 0

Here's my /etc/monit/monitrc entry:

# Run temperature sensor monitor
check process start_sensors.sh with pidfile /var/run/start_sensors.pid
   start = "/home/pi/temperature/start_sensors.sh start"
   stop = "/home/pi/temperature/start_sensors.sh stop"

The output in the monit log looks like:

[EST Jan 24 14:21:16] info     : 'raspberrypi' Monit reloaded
[EST Jan 24 14:21:16] error    : 'start_sensors.sh' process is not running
[EST Jan 24 14:21:16] info     : 'start_sensors.sh' trying to restart
[EST Jan 24 14:21:16] info     : 'start_sensors.sh' start: /home/pi/temperature/start_sensors.sh
[EST Jan 24 14:21:46] error    : 'start_sensors.sh' failed to start (exit status -1) -- Program /home/pi/temperature/start_sensors.sh timed out

So as you can see, monit starts up the program, it runs fine, and then monit kills it thirty seconds later due to the "timeout".

My program is running fine, and producing the proper output that I'm sending to the /tmp/post_temps.out file.

I don't understand why monit is timing the program out... it's supposed to be a long-running process!

I've tried changing the start_sensors.sh script so that it puts the program in the background (and has it write its own /var/run/start_sensors.pid file), but then monit starts a new instance up every thirty seconds or so, not stopping the old ones, and writing over the pid file. It's like it's not even looking at the pid file.

THANKS!

chris_st

1 Answer

The following works:

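#!/bin/bash

case $1 in
  start)
     # Run the program in the background so this script returns to monit right away
     /usr/bin/python /home/pi/temperature/post_temps.py 1>/tmp/post_temps.out &
     # $! is the PID of the background job just started; monit reads it from the pidfile
     echo $! > /var/run/start_sensors.pid
     ;;
   stop)
     kill `cat /var/run/start_sensors.pid` ;;
   *)
     echo "usage: start_sensors {start|stop}" ;;
esac
exit 0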
chris_st
  • For others who want to know why this works: I believe this works because the python script never exited (it was not a daemon), so you used the `&` at the end of the start command to start a new background process and saved that new process' PID instead of the script's PID (see http://stackoverflow.com/a/5163260/2544629 for a list of bash commands) – manroe Jul 31 '16 at 19:23
  • You just saved my life :) – Adim Feb 26 '18 at 15:53
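
To make the `$$` versus `$!` point above concrete, here is a minimal sketch that can be run without monit. It is only an illustration under some assumptions: `sleep 30` stands in for post_temps.py, and the pidfile is a throwaway created with `mktemp` rather than /var/run/start_sensors.pid. Judging by the "timed out" line in the log, monit appears to wait for the start command to return, which is why the backgrounded version gets past the thirty-second mark while the `exec` version does not.

```shell
#!/bin/bash
# Sketch only: "sleep 30" is a stand-in for the real long-running program,
# and the pidfile is a temporary file (assumption, not the real /var/run path).

pidfile=$(mktemp)

sleep 30 &                      # background the stand-in so the script can continue
echo $! > "$pidfile"            # $! = PID of the background job, not of this script

child_pid=$(cat "$pidfile")
script_pid=$$                   # $$ = PID of this script itself

# kill -0 sends no signal; it only checks whether the PID exists
if kill -0 "$child_pid" 2>/dev/null; then alive_before=yes; else alive_before=no; fi

kill "$child_pid"                       # what the "stop" branch does
wait "$child_pid" 2>/dev/null || true   # reap the child so the PID is really gone

if kill -0 "$child_pid" 2>/dev/null; then alive_after=yes; else alive_after=no; fi

echo "script=$script_pid child=$child_pid before=$alive_before after=$alive_after"
rm -f "$pidfile"
```

The PID written to the pidfile belongs to the background job, so a later `kill` reaches the actual program rather than the wrapper script, which has already exited by then.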