
I have a jar that when run, goes through the files in a directory and processes 10 of them before exiting.

I have a shell script that looks something like this:

while true; do
    java -jar myjar.jar
    sleep 2
done

I have another shell script that runs the previous one on startup like so:

nohup loopscript.sh > /var/log/error.log

The problem is that sometimes the jar crashes when it needs more memory than the system has, and the entire loop seems to stop running. My log file ends with a stack trace when the memory cap is hit.

How can I reliably restart the loop after a crash? I read elsewhere on SO to do something like

until myserver; do
    echo "Server 'myserver' crashed with exit code $?.  Respawning.." >&2
    sleep 1
done

But this only works if myserver is itself in a loop, and I'm intentionally halting the jar after 10 runs to force garbage collection and reduce the chance of a crash midway. Is my logic flawed? Should I just put the jar into a loop and use the above method of restarting it when it crashes?

Ben
  • Did you limit yourself to 10 files per run because of memory? This sounds suspiciously like a bad memory leak in your Java program. – Jim Garrison Feb 09 '12 at 00:48
  • The program doesn't appear to leak -- I've tested it on thousands of files in a row. I just had a hunch that it was cheaper to quit it and restart it every few seconds rather than keep it running perpetually. Am I misguided? – Ben Feb 09 '12 at 01:06

1 Answer


As a quick and dirty solution, you can kill the process after some timeout. Here are two scripts in a parent-child relationship:

  • b.sh - parent

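    #!/bin/bash
    echo "Parent running"
    while true; do
      ./a.sh &
      pid=$!
      echo "Child running as $pid"
      sleep 2
      # kill -0 sends no signal; it only tests whether the child still exists.
      # (ps -p typically prints a header line even when the process is gone,
      # so comparing its output to "" is not a reliable liveness check.)
      if kill -0 "$pid" 2>/dev/null; then
        kill "$pid" >/dev/null 2>&1
        echo "Killed $pid"
      fi
    done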
    
  • a.sh - child

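    #!/bin/bash
    echo "Child running"
    # Simulate a job that takes 0-3 seconds, so it sometimes outlives the
    # parent's 2-second limit. $RANDOM is a bash feature.
    seconds=$((RANDOM % 4))
    sleep "$seconds"
    echo "Child finished"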
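
If GNU coreutils is available, `timeout` can handle the kill-after-a-delay part without a separate parent script. Here is a minimal sketch of that idea, not a drop-in fix: `run_with_limit` is a hypothetical helper name, and the 60-second limit in the usage comment is an arbitrary assumption. It relies on `timeout` exiting with status 124 when it has to kill the command.

```shell
#!/bin/bash
# Hypothetical helper: run a command under a time limit and report how it ended.
# Assumes timeout(1) from GNU coreutils, which exits with 124 on a timeout.
run_with_limit() {
  local limit=$1
  shift
  timeout "$limit" "$@"
  local status=$?
  case $status in
    0)   echo "finished normally" ;;
    124) echo "timed out after ${limit}s and was killed" ;;
    *)   echo "crashed with exit code $status" ;;
  esac
  return "$status"
}

# Usage sketch (assumed limit of 60 seconds). The loop restarts the jar
# whether it finished, crashed, or timed out:
#
#   while true; do
#     run_with_limit 60 java -jar myjar.jar >&2
#     sleep 2
#   done
```

Because the loop body never exits on a non-zero status, a crash in the jar just shows up as a log line and the next iteration starts as usual.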
    

However, as @Jim Garrison notes, it's probably much better to design your app to run correctly, whatever that means in your case. This way, you can actually improve your app and see why you need that much memory. You'll probably solve some cases which will pop up in the future, but are not visible because you are just "solving" the problem by restarting.

It's like playing Russian roulette: yes, you may get lucky 20 times in a row, but sooner or later it is going to fail.

icyrock.com
  • The fix will work for now, but I am beginning to see that ironing out the existing issues is going to be the best long-term solution. Thanks for the answer! – Ben Feb 09 '12 at 02:43
  • Sure thing Ben - yes, I think it will most likely pay off to investigate the issues instead of circumventing them like this. Depends on the project, but if it's going to be used for any reasonable amount of time, you will usually waste way more time (and nerves) patching than solving properly. – icyrock.com Feb 10 '12 at 03:45