71

I've heard very contradictory things on how to best handle this, and am stuck with the following dilemma:

  • an OOME brings down a thread, but not the whole application
  • and I need to bring down the whole application but can't because the thread doesn't have any memory left

I've always understood that best practice is to let them propagate so the JVM can die, because the JVM is in an inconsistent state at that point, but that doesn't seem to be working here.

djechlin
  • About all I can say is that handling an out of memory error is *very* difficult. Any handler you have must be careful to not create ANY new objects -- use pre-created objects (and beware of modifications to them that may do allocations). – Hot Licks Aug 23 '12 at 16:49

9 Answers

66

In Java version 8u92 the VM arguments

  • -XX:+ExitOnOutOfMemoryError
  • -XX:+CrashOnOutOfMemoryError

were added, see the release notes.

ExitOnOutOfMemoryError
When you enable this option, the JVM exits on the first occurrence of an out-of-memory error. It can be used if you prefer restarting an instance of the JVM rather than handling out of memory errors.

CrashOnOutOfMemoryError
If this option is enabled, when an out-of-memory error occurs, the JVM crashes and produces text and binary crash files.

Enhancement request: JDK-8138745 (the parameter name given there is wrong, see JDK-8154713: the option is ExitOnOutOfMemoryError, not ExitOnOutOfMemory).
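To confirm that a running JVM actually knows these flags and whether they are switched on, something like the following might help. It is only a sketch: it assumes a HotSpot-based JVM that exposes `com.sun.management.HotSpotDiagnosticMXBean`, and the class name `OomeFlagCheck` and the "unsupported" marker are made up for illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class OomeFlagCheck {

    /** Returns the current value of a HotSpot -XX option, or "unsupported" if this JVM does not know it. */
    static String vmOption(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return "unsupported";
            }
            return bean.getVMOption(name).getValue();
        } catch (IllegalArgumentException e) {
            // Thrown when the option (or the MXBean itself) does not exist on this JVM
            return "unsupported";
        }
    }

    public static void main(String[] args) {
        for (String flag : new String[] {"ExitOnOutOfMemoryError", "CrashOnOutOfMemoryError"}) {
            System.out.println(flag + " = " + vmOption(flag));
        }
    }
}
```

On a JVM older than 8u92 both lookups should report "unsupported"; on newer ones they report "true" or "false" depending on how the process was started.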

flavio.donze
  • 3
    Unfortunately this only works for "real" OutOfMemoryExceptions where the heap is exhausted. If the error occurs because native threads are exhausted, the ExitOnOutOfMemoryError and CrashOnOutOfMemoryError won't be triggered. See https://bugs.openjdk.java.net/browse/JDK-8155004 – radlan Jul 16 '18 at 09:33
63

OutOfMemoryError is just like any other error. If it escapes from Thread.run() it will cause the thread to die. Nothing more. Also, when a thread dies, it is no longer a GC root, thus all references kept only by this thread are eligible for garbage collection. This means the JVM is very likely to recover from an OOME.
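A small demo of that behaviour, as a sketch only: the error here is thrown by hand rather than caused by a real allocation failure, and the class and method names are invented for the example.

```java
public class ThreadDeathDemo {

    /** Starts a worker that throws the given error out of run(); returns true once that worker has terminated. */
    static boolean workerDiesAlone(Error simulated) {
        Thread worker = new Thread(() -> { throw simulated; }, "worker");
        worker.start();
        try {
            worker.join();                       // wait for the worker to die
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();  // preserve the interrupt for the caller
            return false;
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        boolean dead = workerDiesAlone(new OutOfMemoryError("simulated, not a real allocation failure"));
        // The default handler prints the worker's stack trace to stderr, but this thread keeps going
        System.out.println("worker terminated: " + dead + ", main thread still running");
    }
}
```

The stack trace shows up on stderr, yet the process stays up, which is exactly the situation described in the question.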

If you want to kill your JVM no matter what because you suspect it can be in an inconsistent state, add this to your java options:

-XX:OnOutOfMemoryError="kill -9 %p"

%p is a placeholder for the PID of the current Java process. The rest is self-explanatory.

Of course you can also try catching OutOfMemoryError and handling it somehow. But that's tricky.

Tomasz Nurkiewicz
  • 23
    Warning: an abrupt kill -9 may take down the process before your logs have flushed so it may look like a crash with no indication of what happened. Would be wise to use a chain of commands which try a polite shutdown "stop.sh %p" where your stop script can log that it is about to kill the process then try "kill -TERM $1", then sleep, and then do kill -9 last. That way you won't experience mystery crashes when your jvm commits suicide without logging what it was doing just before in the main logs. – simbo1905 May 15 '15 at 09:52
  • Unfortunately this only works for "real" OutOfMemoryExceptions where the heap is exhausted. If the error occurs because native threads are exhausted, the OnOutOfMemoryError won't be triggered. See https://bugs.openjdk.java.net/browse/JDK-8155004 – radlan Jul 16 '18 at 09:32
  • 1
    If the native threads are exhausted, there's another problem: the JVM would have to `fork()` in order to execute the kill command, but that would require another native thread. I found this little tool that might help: https://github.com/airlift/jvmkill – Roland Weber Mar 21 '19 at 08:14
37

With version 8u92 there's now a JVM option in the Oracle JDK to make the JVM exit when an OutOfMemoryError occurs:

From the release notes:

ExitOnOutOfMemoryError - When you enable this option, the JVM exits on the first occurrence of an out-of-memory error. It can be used if you prefer restarting an instance of the JVM rather than handling out of memory errors.

ahu
  • 4
    Unfortunately this only works for "real" OutOfMemoryExceptions where the heap is exhausted. If the error occurs because native threads are exhausted, the ExitOnOutOfMemoryError won't be triggered. See https://bugs.openjdk.java.net/browse/JDK-8155004 – radlan Jul 16 '18 at 09:33
5

If you want to bring down your program, take a look at the -XX:OnOutOfMemoryError="<cmd args>;<cmd args>" (documented here) option on the command line. Just point it to a kill script for your application.

In general, I have never had any luck handling this error gracefully without restarting the application. There was always some kind of corner case slipping through, so I personally suggest that you do stop your application, but also investigate the source of the problem.

Marcus Riemer
5

You can force your program to terminate in several ways once the error occurs. As others have suggested, you can catch the error and call System.exit afterwards, if needed. But I suggest you also use -XX:+HeapDumpOnOutOfMemoryError; this way the JVM will write a heap dump file with the contents of your application's memory when the error occurs. You can then open the dump with a profiler; I recommend Eclipse MAT. This way you will find the cause of the issue fairly quickly and can react properly. If you are not using Eclipse, you can use Eclipse MAT as a standalone product, see: http://wiki.eclipse.org/index.php/MemoryAnalyzer.

dan
  • Agree about the memory dump to analyse the cause. Disagree about to hardcode System.exit. In my opinion this strategy is dangerous, so if we choose it, I prefer to use options on the command line as the integrator will see them and be less surprised (and he will be able to change it). – mcoolive Mar 18 '15 at 10:01
3

Since the JVM options

-XX:+ExitOnOutOfMemoryError

-XX:+CrashOnOutOfMemoryError

-XX:OnOutOfMemoryError=...

don't work if the OutOfMemoryError occurs because of exhausted threads (see the corresponding JDK bug report), it may be worth trying the tool jkill. It registers via JVMTI and exits the VM if the memory or the available threads are exhausted.

In my tests it works as expected (and how I would expect the JVM options to work).

Yuri
radlan
2

I suggest handling all uncaught exceptions from within the application to ensure it tries to give you the best possible data before terminating. Then have an external script that restarts your process when it crashes.

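import java.io.PrintWriter;
import java.io.StringWriter;
import java.lang.Thread.UncaughtExceptionHandler;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.text.SimpleDateFormat;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;

public class ExitProcessOnUncaughtException implements UncaughtExceptionHandler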
{
    static public void register()
    {
        Thread.setDefaultUncaughtExceptionHandler(new ExitProcessOnUncaughtException());
    }

    private ExitProcessOnUncaughtException() {}


    @Override
    public void uncaughtException(Thread t, Throwable e) 
    {
        try {
            StringWriter writer = new StringWriter();
            e.printStackTrace(new PrintWriter(writer));
            System.out.println("Uncaught exception caught"+ " in thread: "+t);
            System.out.flush();
            System.out.println();
            System.err.println(writer.getBuffer().toString());
            System.err.flush();
            printFullCoreDump();
        } finally {
            Runtime.getRuntime().halt(1);
        }
    }

    public static void printFullCoreDump()
    {
        SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        System.out.println("\n"+
            sdf.format(System.currentTimeMillis())+"\n"+
            "All Stack Trace:\n"+
            getAllStackTraces()+
            "\nHeap\n"+
            getHeapInfo()+
            "\n");
    }

    public static String getAllStackTraces()
    {
        String ret="";
        Map<Thread, StackTraceElement[]> allStackTraces = Thread.getAllStackTraces();

        for (Entry<Thread, StackTraceElement[]> entry : allStackTraces.entrySet())
            ret+=getThreadInfo(entry.getKey(),entry.getValue())+"\n";
        return ret;
    }

    public static String getHeapInfo()
    {
        String ret="";
        List<MemoryPoolMXBean> memBeans = ManagementFactory.getMemoryPoolMXBeans();               
        for (MemoryPoolMXBean mpool : memBeans) {
            MemoryUsage usage = mpool.getUsage();

            String name = mpool.getName();      
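            long used = usage.getUsed();
            long max = usage.getMax();
            // getMax() returns -1 when the pool has no defined maximum, so guard the division
            String total = max > 0 ? (max/1000)+"K" : "undefined";
            String pctUsed = max > 0 ? (used * 100 / max)+"%" : "n/a";
            ret+=" "+name+" total: "+total+", "+pctUsed+" used\n";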
        }
        return ret;
    }

    public static String getThreadInfo(Thread thread, StackTraceElement[] stack)
    {
        String ret="";
        ret+="\n\""+thread.getName()+"\"";
        if (thread.isDaemon())
            ret+=" daemon";
        ret+=
                " prio="+thread.getPriority()+
                " tid="+String.format("0x%08x", thread.getId());
        if (stack.length>0)
            ret+=" in "+stack[0].getClassName()+"."+stack[0].getMethodName()+"()";
        ret+="\n   java.lang.Thread.State: "+thread.getState()+"\n";
        ret+=getStackTrace(stack);
        return ret;
    }

    public static String getStackTrace(StackTraceElement[] stack)
    {
        String ret="";
        for (StackTraceElement element : stack)
            ret+="\tat "+element+"\n";
        return ret;
    }
}
Shloim
  • This is at least an attempt to leave a helpful trace. If, however, this code runs into another OOM, I wonder if you get an infite exception handler recursion loop. – Harald Nov 10 '16 at 10:17
  • You can make sure it doesn't if you unset the default exception on the first line of the handler. – Shloim Nov 10 '16 at 11:00
1

Generally speaking, you should never write a catch block that catches java.lang.Error or any of its subclasses, including OutOfMemoryError. The only exception to this would be if you are using a third-party library that throws a custom subclass of Error when it should have subclassed RuntimeException. This is really just a workaround for an error in their code, though.

From the JavaDoc for java.lang.Error:

An Error is a subclass of Throwable that indicates serious problems that a reasonable application should not try to catch.

If you are having problems with your application continuing to run even after one of the threads dies because of an OOME, you have a couple of options.

First, you might want to check to see if it's possible to mark the remaining threads as daemon threads. If there is ever a point when only daemon threads remain in the JVM it will run all the shutdown hooks and terminate as orderly as possible. To do this you'll need to call setDaemon(true) on the thread object before it is started. If the threads are actually created by a framework or some other code you might have to use a different means to set that flag.
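As a rough illustration of the daemon-thread option (the helper name `startDaemon` and the thread name are just placeholders):

```java
public class DaemonThreads {

    /** Starts a background thread flagged as daemon, so it cannot keep the JVM alive on its own. */
    static Thread startDaemon(Runnable task, String name) {
        Thread t = new Thread(task, name);
        t.setDaemon(true);   // must be called before start(); on a live thread it throws IllegalThreadStateException
        t.start();
        return t;
    }

    public static void main(String[] args) {
        Thread t = startDaemon(() -> {
            try {
                Thread.sleep(Long.MAX_VALUE);    // stands in for a long-running background job
            } catch (InterruptedException ignored) {
                // nothing to clean up in this sketch
            }
        }, "housekeeping");
        System.out.println(t.getName() + " daemon=" + t.isDaemon());
        // main returns here; with only daemon threads left, the JVM exits instead of hanging
    }
}
```

If the main (non-daemon) thread is the one that dies from the OOME, the process then goes away on its own because nothing but daemon threads is left.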

The other option is to assign an uncaught exception handler to the threads in question and call either System.exit() or, if absolutely necessary, Runtime.getRuntime().halt(). Calling halt is very dangerous as shutdown hooks won't even attempt to run, but in certain situations halt might work where System.exit would have failed because an OOME has already been thrown.
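A minimal sketch of such a handler, under a few assumptions of mine: the class name, the exit status 3, and the choice to also inspect the cause chain are illustrative, not anything prescribed by the JDK. The terminating action is created once at install time so that the handler itself does not need to allocate when it fires.

```java
public final class ExitOnOomeHandler implements Thread.UncaughtExceptionHandler {

    private final Runnable terminate;

    /** The terminating action is injected so it is allocated up front, not while memory is exhausted. */
    ExitOnOomeHandler(Runnable terminate) {
        this.terminate = terminate;
    }

    /** Installs the handler for all threads; halt() skips shutdown hooks, so it cannot get stuck in them. */
    public static void install() {
        Thread.setDefaultUncaughtExceptionHandler(
                new ExitOnOomeHandler(() -> Runtime.getRuntime().halt(3)));
    }

    /** True if the throwable or one of its causes is an OutOfMemoryError (depth-limited to avoid cycles). */
    static boolean isOome(Throwable t) {
        for (int depth = 0; t != null && depth < 32; t = t.getCause(), depth++) {
            if (t instanceof OutOfMemoryError) {
                return true;
            }
        }
        return false;
    }

    @Override
    public void uncaughtException(Thread thread, Throwable error) {
        if (isOome(error)) {
            terminate.run();           // bring the whole process down
        } else {
            error.printStackTrace();   // anything else: just report it, as the default handler would
        }
    }
}
```

Calling `ExitOnOomeHandler.install()` early in `main` would be the intended usage. Swapping `halt(3)` for `System.exit(3)` gives shutdown hooks a chance to run, at the risk described above.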

Mike Deck
0

You can surround your thread code with a try/catch for the OOME and do some manual cleanup if such an event occurs. A trick is to make your thread function nothing more than a try/catch around another function. Upon a memory error, unwinding out of the inner function should free up its stack frames (and the references they held), allowing you to do some quick cleanup. This should work if you request a garbage collection on some resources immediately after catching and/or set a dying flag to tell other threads to quit.

Once the thread with the OOME dies and you do some collection on its elements, you should have more than enough free space for other threads to quit in an orderly fashion. This is a more graceful quit, with an opportunity to log the problem before dying as well.
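Roughly what I mean, as a sketch with made-up names (`GuardedWorker`, `DYING`); the OOME in `main` is thrown by hand to keep the demo deterministic:

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class GuardedWorker implements Runnable {

    /** Shared flag that other threads poll to learn they should wind down. */
    static final AtomicBoolean DYING = new AtomicBoolean(false);

    private final Runnable work;

    GuardedWorker(Runnable work) {
        this.work = work;
    }

    @Override
    public void run() {
        try {
            work.run();          // all real work, and all of its local references, live in here
        } catch (OutOfMemoryError e) {
            // The frames of work.run() are gone by now, so objects only they referenced are collectable.
            DYING.set(true);     // flips a pre-allocated flag; no new objects needed
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t = new Thread(new GuardedWorker(() -> {
            throw new OutOfMemoryError("simulated");
        }));
        t.start();
        t.join();
        System.out.println("dying flag set: " + DYING.get());
    }
}
```

Other threads would check `GuardedWorker.DYING.get()` in their loops and return when it turns true.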

Pyrce
  • 1
    Careful here: this might work for simple apps, or maybe when there's one task which allocates huge objects and gets OOM'ed a lot. Otherwise, when you use any 3rd party frameworks, it will tend to be *their* threads that get hit by the OOME, not yours, and then the framework will be crippled, and your handler won't be called so your app will be zombified: not working, not exited. Generally it's not a reliable approach. – SusanW Jan 26 '21 at 17:09