1

Last night I ran a big overnight batch program, written in Java, on a Linux-based server. I can't find anything in my error logs that suggests the Java application encountered an error.

Is there a way in Linux to see if a program exited unexpectedly?

The program is one of many that are run overnight from a cron job (crontab), and it runs from its own main method. It catches a series of exceptions, printing a message with System.err.println and exiting with status 1 if any of them is hit.

NB: I always use a Logger in my own code; unfortunately, I'm dealing with legacy code written by someone else.

Alexei Blue
  • Are you starting the job using crontab? – nidhin Nov 21 '11 at 09:55
  • Yes, the update runs from a cron job (crontab). It runs every night, and it seems that last night it failed for some reason, i.e. exited suddenly. The batch job itself synchronises two massive tables to ensure that any data changes made during the day are backed up; the data has not been synchronised, which is how I know it failed. – Alexei Blue Nov 21 '11 at 10:24

5 Answers

2

If Java crashed there will be a hs_err_pid????.log file in the working directory of the application by default. (This is unlikely to be the case)

If the application logged an error before exiting, you need to find out where your application writes its logs and read those (they can be anywhere on your system).
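As a rough sketch of the first check, the function below searches a directory tree for recently modified HotSpot crash logs. The function name `find_hs_err_logs` and its arguments are made up for illustration, and it assumes the JVM wrote the file under the default `hs_err_pid<pid>.log` name somewhere below the directory you pass in.

```shell
# Hypothetical helper: list HotSpot crash logs under DIR that were
# modified within the last DAYS days (default: 1).
# Usage: find_hs_err_logs DIR [DAYS]
find_hs_err_logs() {
    dir=${1:-.}
    days=${2:-1}
    # -mtime "-N" matches files modified less than N*24 hours ago.
    find "$dir" -type f -name 'hs_err_pid*.log' -mtime "-$days" 2>/dev/null
}
```

Pointing it at the working directory the cron job runs in (often the user's home directory, unless the job changes directory first) is a reasonable starting point.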

Peter Lawrey
  • Well, the code writes error messages to System.err when an exception is hit, which is the standard error log location unless otherwise changed. It doesn't appear to be redirected anywhere else in the code, and like most of the batch jobs, this one runs its own main method, so only one class is involved. – Alexei Blue Nov 21 '11 at 10:34
  • So you need to find where System.err was redirected. Normally a specific file is given e.g. `2> mayapp.err.log` If you are not using a tool like crontab, it will have been written to the screen or discarded. – Peter Lawrey Nov 21 '11 at 11:40
2

There's no easy mechanism to discover what you're after if whatever tool you used to start the JVM didn't bother recording the exit status for you.

If you're running the auditd(8) server to provide audit logging, your auditd(8) is configured to log abnormal exits, and your JVM exited abnormally (signal-based termination), then you can look for ANOM_ABEND events in /var/log/audit/audit.log:

# ausearch -m ANOM_ABEND
/sbin/audispd permissions should be 0750
----
time->Tue Nov  8 18:42:22 2011
type=ANOM_ABEND msg=audit(1320806542.571:264): auid=4294967295 uid=1000 gid=1000 ses=4294967295 pid=11955 comm="regex" sig=11
...

For future executions you might want to do something like this:

java -jar /path/to/whatever.jar && echo "$(date)" >> /path/to/dir/success || echo "$(date)" >> /path/to/dir/failure

This will echo the date of success or failure into a log file -- assuming that your application uses the standard Unix-style exit(0) for success and anything else for failure.

sarnold
  • Thanks Sarnold, that's definitely something I have written down and will probably use for these batch job updates, which each have their own main method and are run separately rather than continuously. – Alexei Blue Nov 21 '11 at 10:42
1

It is the virtual machine process (named java) whose termination status you want to check. You can write a trivial script with two commands: the first invokes the Java VM to run the program, and the second records the exit status with echo $?.
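A minimal sketch of such a wrapper, written as a POSIX shell function, might look like the following. The name `run_and_record`, the log format, and the paths in the usage comment are all placeholders chosen for illustration, not part of any existing tool.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command, append a timestamp, its exit
# status, and the command line to LOG, then return the same status.
# Usage: run_and_record LOG COMMAND [ARGS...]
run_and_record() {
    log=$1
    shift
    "$@"
    status=$?   # capture immediately, before any other command overwrites $?
    printf '%s exit=%s cmd=%s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$status" "$*" >> "$log"
    return "$status"
}

# Example invocation from a cron-driven script (placeholder paths):
# run_and_record /path/to/dir/batch-status.log java -jar /path/to/whatever.jar
```

With the legacy code described in the question, a caught exception should show up as `exit=1`. In most shells a status above 128 indicates the process was killed by a signal (128 plus the signal number), which would point to a crash or an external kill rather than a handled exception.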

necromancer
1

If you did write the application, you should use a logger that writes to a file.

See this tutorial on how to use Log4j with a file appender. In your code, you need to catch and log exceptions.

See this issue.

Community
Stephan
  • I agree with you on this 100%, and in the code I write I do use Logger.log, which points to a properties file and holds constants for an APP_LOG, ERROR_LOG and INFO_LOG. Unfortunately I'm dealing with legacy code, and the guy who wrote it, possibly about 10 years ago, used System.err to print errors in catch statements. Thanks for the links; I will read them a little later on, and they will be useful as I go through and refactor :) – Alexei Blue Nov 21 '11 at 10:38
  • You can also redirect the System.err and System.out streams to log4j: http://stackoverflow.com/questions/5712764/redirect-system-out-println-to-log4j-while-keeping-class-name-information You should do this only for debugging, as I think it slows down the output a lot. – Stephan Nov 21 '11 at 10:43
1

Because you've run your programs out of cron(8), there's a good chance that the standard error of the program has in fact been captured and mailed somewhere.

Check the crontab(5) for the user account that runs the program. (If it is run out of /etc/crontab or /etc/cron.d/, then in those files.) Look for the MAILTO variable. If it doesn't exist, then cron(8) tried to deliver mail to the crontab(5) owner. If it does exist, then cron(8) tried to deliver mail to whoever is specified with the variable.

Look in /var/spool/mail/ for the user's mailbox, if the server doesn't seem like it's got an email setup in place -- there might be enough for local delivery.
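The MAILTO check can be sketched as a small filter over crontab text. The function name `cron_mail_target` and its output wording are invented for this example; it only reports what the crontab says and does not confirm that mail was actually delivered.

```shell
# Hypothetical helper: read crontab text on stdin and print any MAILTO
# assignments it contains, or a note that cron falls back to the owner.
cron_mail_target() {
    lines=$(sed -n -E '/^[[:space:]]*MAILTO[[:space:]]*=/p')
    if [ -n "$lines" ]; then
        printf '%s\n' "$lines"
    else
        echo "no MAILTO line: cron mails the crontab owner"
    fi
}

# Example usage for the account that runs the job:
# crontab -l | cron_mail_target
# cron_mail_target < /etc/crontab
# ls -l /var/spool/mail/
```

Note that a line such as `MAILTO=""` is printed as-is; an empty value tells cron not to send mail at all, in which case the output of the job was discarded.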

sarnold
  • Cheers Sarnold, we managed to trace the email to an old employee's address, which unfortunately we don't have access to, but we are now reviewing a lot of changes and putting logging in place to capture errors in the batch jobs :) – Alexei Blue Dec 08 '11 at 11:12