23

An OOME belongs to the class of errors that you generally shouldn't recover from. But if it is buried in a thread, or someone catches it, an application can end up in a state in which it doesn't exit but isn't useful either. Any suggestions on how to prevent this, even when using libraries that may foolishly try to catch Throwable or Error/OOME (i.e. where you can't modify the source code)?

Michael Neale
  • Why shouldn't you recover from it? An OOME is not something that happens through a programming error (like a null pointer or illegal argument) but because of an unforeseen situation at runtime, which a really stable application should try to survive. Of course, it would be wise to notify admins if it happens (in a server app) so they can look into it. – Bart van Heukelom Oct 06 '10 at 09:56
  • @Bart - I am pretty sure what you suggest is exactly what people should NOT do (other than in very exceptional circumstances). You can read the Java docs on it for more details. – Michael Neale Oct 06 '10 at 21:40
  • Don't work with that kind of people? – Raedwald Jul 08 '13 at 12:45
  • @BartvanHeukelom see http://stackoverflow.com/questions/333736/is-out-of-memory-a-recoverable-error – Raedwald Jul 08 '13 at 12:58

10 Answers

36

Solution:

On newer JVMs:

-XX:+ExitOnOutOfMemoryError
to exit on OOME, or to crash:

-XX:+CrashOnOutOfMemoryError

On older JVMs:

-XX:OnOutOfMemoryError="<cmd args>; <cmd args>"

Definition: Run user-defined commands when an OutOfMemoryError is first thrown. (Introduced in 1.4.2 update 12 and in Java 6.)

See http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html

An example that kills the running process:

-XX:OnOutOfMemoryError="kill -9 %p"
Michael Neale
  • Hmm ... that might work, depending on what you want the commands to do. Reporting errors would be fine, but `kill -9`-ing the JVM might have nasty side-effects. – Stephen C Oct 07 '10 at 02:47
  • Well you could SIGTERM and wait... and if it doesn't die then kill. – Michael Neale Nov 04 '11 at 02:44
  • If you run on a Unix-like system, you could use this command https://gist.github.com/xylifyx/5865113 in your -XX:OnOutOfMemoryError="killparent" – Erik Martino Jun 26 '13 at 06:17
  • As an alternative to compiling the killparent program, one might consider using the PPID shell variable, like this: `-XX:OnOutOfMemoryError="kill -9 $PPID"`. – Po' Lazarus Jul 05 '13 at 07:39
  • As of [Java 8u92](http://www.oracle.com/technetwork/java/javase/8u92-relnotes-2949471.html) you can use -XX:+ExitOnOutOfMemoryError or -XX:+CrashOnOutOfMemoryError – Dennie Jan 24 '18 at 11:34
8

If some piece of code in your application's JVM decides that it wants to try to catch OOMEs and attempt to recover, there is (unfortunately) nothing that you can do to stop it ... apart from AOP heroics that are probably impractical, and are definitely bad for your application's performance and maintainability. Apart from that, the best you can do is to pull the plug on the JVM using an "OnOutOfMemoryError" hook. See the answer above: https://stackoverflow.com/a/3878199/139985/

Basically, you have to trust other developers not to do stupid things. Other stupid things that you probably shouldn't try to defend against include:

  • calling System.exit() deep in a library method,
  • calling Thread.stop() and friends,
  • leaking open streams, database connections and so on,
  • spawning lots of threads,
  • randomly squashing (i.e. catching and ignoring) exceptions,
  • etc.

In practice, the way to pick up problems like this in code written by other people is to use code quality checkers, and perform code reviews.

If the problem is in 3rd-party code, report it as a BUG (which it probably is) and if they disagree, start looking for alternatives.


For those who don't already know this, there are a number of reasons why it is a bad idea to try to recover from an OOME:

  1. The OOME might have been thrown while the current thread was in the middle of updating some important data structure. In the general case, the code that catches this OOME has no way of knowing this, and if it tries to "recover" there is a risk that the application will continue with a damaged data structure.

  2. If the application is multi-threaded there is a chance that OOMEs might have been thrown on other threads as well, making recovery even harder.

  3. Even if the application can recover without leaving data structures in an inconsistent state, the recovery may just cause the application to limp along for a few seconds more and then OOME again.

  4. Unless you set the JVM options appropriately, a JVM that has almost run out of memory tends to spend a lot of time garbage collecting in a vain attempt to keep going. Attempting to recover from OOMEs is likely to prolong the agony.

Recovering from an OOME does nothing to address the root cause, which is typically a memory leak, a poorly designed (i.e. memory-wasteful) data structure, and/or launching the application with a heap that is too small.

Community
Stephen C
  • I think the OP has a clear notion that catching OOME is a bad idea; he probably just ran into a case where this happens and wants to make the whole system shutdown rather than just continue crippled. Always good to highlight these points, anyhow :) – biasedbit Oct 06 '10 at 10:21
  • well it can be more innocent - thread t1 gets an OOME - and dies. Thread t2 - which doesn't do anything requiring heap keeps running, preventing the JVM from exiting. – Michael Neale Oct 06 '10 at 10:23
  • @bruno - 5 minutes with `find` and `grep` should be sufficient to find the offending code. (I'm assuming he has source code access. If he doesn't it might take a bit longer ... using a decompiler.) – Stephen C Oct 06 '10 at 10:25
  • Yes - exactly - I do NOT want to recover from OOMEs - it's just that I have noticed that people in various libraries will sometimes, deliberately or otherwise, make mistakes which prevent OOMEs from percolating up. In multi-threaded code it is easy to do (I am being nice here!). – Michael Neale Oct 06 '10 at 10:25
  • @Michael - yes, that's a problem. But maybe the solution to that would be to set a default uncaught exception handler that detects `Error`s and metaphorically pulls the plug on the JVM. – Stephen C Oct 06 '10 at 10:27
  • (probably redundant, but adding to the above comment) Thread.currentThread().setUncaughtExceptionHandler(); Way more straightforward than AOP, but also less chances to catch the Exception. – biasedbit Oct 06 '10 at 10:32
  • @StephenC, @brunodecarvalho - I'm a big fan of UEHs, but isn't the problem here that some piece of code DID catch (and consume & suppress) the Throwable? – RonU Oct 06 '10 at 15:34
  • @RonU I mentioned that fact. AOP's chances to solve this are way higher than UEH but it's a complete overkill. I'd still go with it over having some other thread or whatever periodically doing stuff in the heap (trying to raise OOME to shutdown the JVM). – biasedbit Oct 06 '10 at 16:15
  • @Rob - yes it is. But if you read all the comments, you will see that it was Michael Neale (the OP) who brought up the problem of OOME's getting lost. – Stephen C Oct 07 '10 at 02:44
  • On re-reading, this response looks like it wasn't based on reading the question at all. The first paragraph is clearly wrong based on many other answers, and the last states exactly what the question is asking about. I don't understand (perhaps I edited the question since and forgot about it). – Michael Neale Dec 02 '15 at 05:15
  • It might look like it. But actually I did read and fully understand the question. *"The first paragraph is clearly wrong based on many other answers ..."* except that most of those suggestions don't actually address the problem of wanton OOME >>catching<<. The only ones that do (really) either pull the plug on the JVM when there is an OOME, or *before* an OOME is thrown. – Stephen C Dec 02 '15 at 06:37
  • *" ... the last states exactly what the question is asking about."* - You are taking it out of context. The context starts *"For those who don't already know ..."*. – Stephen C Dec 02 '15 at 06:42
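
The default uncaught exception handler suggested in the comments above could look roughly like the sketch below. `HaltOnErrorHandler` and its injectable `exitAction` are hypothetical names introduced here for illustration, not part of any library. As RonU points out, this only helps when the `Error` actually escapes the thread; it does nothing if some code catches and swallows it first.

```java
import java.util.function.IntConsumer;

/** Hypothetical helper: pulls the plug on the JVM when a thread dies with an Error. */
final class HaltOnErrorHandler implements Thread.UncaughtExceptionHandler {
    private final IntConsumer exitAction;

    /** Default: Runtime.halt, which skips shutdown hooks that might themselves need heap. */
    HaltOnErrorHandler() {
        this(Runtime.getRuntime()::halt);
    }

    /** The exit action is injectable (an assumption of this sketch) so the handler can be tried without killing the JVM. */
    HaltOnErrorHandler(IntConsumer exitAction) {
        this.exitAction = exitAction;
    }

    @Override
    public void uncaughtException(Thread t, Throwable e) {
        if (e instanceof Error) {
            // Do as little as possible here: the heap may already be exhausted.
            exitAction.accept(1);
        } else {
            // Ordinary exceptions are only reported; the JVM keeps running.
            System.err.println("Uncaught exception in " + t.getName() + ": " + e);
        }
    }

    /** Call once, early in main(), so threads without their own handler are covered. */
    static void install() {
        Thread.setDefaultUncaughtExceptionHandler(new HaltOnErrorHandler());
    }
}
```

Calling `HaltOnErrorHandler.install()` at startup would also cover the case Michael Neale describes, where thread t1 dies with an OOME while t2 keeps the JVM alive.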
2
  1. edit OutOfMemoryError.java, add System.exit() in its constructors.

  2. compile it. (interestingly javac doesn't care it's in package java.lang)

  3. add the class into JRE rt.jar

  4. now jvm will use this new class. (evil laughs)

This is a possibility you might want to be aware of. Whether it's a good idea, or even legal, is another question.

irreputable
  • That's legal, but it is not something you would **ever** want to do in production code. – Stephen C Oct 07 '10 at 02:48
  • Re - *"interestingly javac doesn't care it's in package java.lang"*. Well how else would you compile Java code in `java.lang`? Yes, there are security checks to stop a normal application JAR (etc) from replacing "java.lang.*" etc classes, but these have to be enforced by the class loader / security manager. (If you relied on the Java compiler to do the enforcement, it would be relatively simple to subvert.) – Stephen C Oct 07 '10 at 02:50
1

One more thing I could think of (although I do not know how to implement it) would be to run your app in some kind of debugger. I noticed that my debugger can stop the execution when an exception is thrown. :-)

So maybe one could implement some kind of execution environment to achieve that.

DerMike
  • You don't want to run production code in a debugger. For a start, I believe that you take a significant performance hit by doing that. – Stephen C Dec 02 '15 at 06:36
1

User @dennie posted a comment which should really be its own answer. Newer JVM features make this easy, specifically

-XX:+ExitOnOutOfMemoryError

to exit on OOME, or to crash:

-XX:+CrashOnOutOfMemoryError

Since Java 8u92 https://www.oracle.com/java/technologies/javase/8u92-relnotes.html

A248
0

How about catching OOME yourself in your code and System.exit()?
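
A minimal sketch of that idea, assuming you control the point where tasks are handed to threads or executors. `OomeGuard` and the injectable `exitAction` are hypothetical names used only for this example. One place where it may be useful is with `ExecutorService.submit()`, where a throwable raised by the task is captured in the returned `Future` rather than reaching an uncaught exception handler.

```java
import java.util.function.IntConsumer;

/** Hypothetical helper: wraps a task so that an OOME escaping it terminates the JVM. */
final class OomeGuard {
    private OomeGuard() {}

    /** Wraps the task; exitAction is injectable (an assumption of this sketch) for experimentation. */
    static Runnable guard(Runnable task, IntConsumer exitAction) {
        return () -> {
            try {
                task.run();
            } catch (OutOfMemoryError e) {
                exitAction.accept(1);
                // Only reached if exitAction returns; rethrow so the wrapper never swallows the error.
                throw e;
            }
        };
    }

    /** Default wrapper: calls System.exit(1), as the answer suggests. */
    static Runnable guard(Runnable task) {
        return guard(task, System::exit);
    }
}
```

For example, `executor.submit(OomeGuard.guard(myTask))`, where `myTask` stands for your own `Runnable`. The limitation is unchanged: a `catch` deeper inside the task still sees the error first.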

Adam Schmideg
  • A deeper catch would still catch it first, and the only difference with not catching is that now you'll quit the whole program instead of just the thread. – Bart van Heukelom Oct 06 '10 at 10:01
0

You can run your java program using Java Service Wrapper with an OutOfMemory Detection Filter. However, this assumes that the "bad people" are nice enough to log the error :)

dogbane
0

One possibility, which I would love to be talked out of, is to have a stupid thread whose job is to do something on the heap. Should it receive an OOME, it exits the whole JVM.

Please tell me this isn't sensible.
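
For concreteness, a rough sketch of such a thread. The class name `HeapCanary`, the probe size, the interval, and the injectable `exitAction` are all assumptions made for this example. It only notices heap exhaustion at the moment of a probe, so it is a heuristic, not a guarantee.

```java
import java.util.function.IntConsumer;

/** Hypothetical watchdog: periodically allocates on the heap and exits the JVM if that fails. */
final class HeapCanary implements Runnable {
    private final int probeBytes;
    private final long intervalMillis;
    private final IntConsumer exitAction;

    HeapCanary(int probeBytes, long intervalMillis, IntConsumer exitAction) {
        this.probeBytes = probeBytes;
        this.intervalMillis = intervalMillis;
        this.exitAction = exitAction;
    }

    /** One probe: returns true if an allocation of probeBytes succeeded. */
    boolean probe() {
        try {
            byte[] block = new byte[probeBytes];
            block[0] = 1; // touch the array so the allocation is actually used
            return true;
        } catch (OutOfMemoryError e) {
            return false;
        }
    }

    @Override
    public void run() {
        while (!Thread.currentThread().isInterrupted()) {
            if (!probe()) {
                exitAction.accept(1);
                return;
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                return; // stop quietly when interrupted
            }
        }
    }

    /** Starts a daemon canary that halts the JVM; 1 MiB every second is an arbitrary choice. */
    static Thread start() {
        Thread t = new Thread(new HeapCanary(1 << 20, 1000L, Runtime.getRuntime()::halt), "heap-canary");
        t.setDaemon(true);
        t.start();
        return t;
    }
}
```

The obvious drawbacks: the probe itself adds allocation pressure, and a transient spike in another thread can make the canary kill a JVM that would otherwise have recovered after the next collection.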

Michael Neale
  • I'd focus on avoiding OOME at all; there's no elegant solution for this problem since it's a consequence of bad practices put to use. Either a) remove the try-catches (assumes access to source) b) use alternate libs c) report the problem and submit a patch d) write your own lib. – biasedbit Oct 06 '10 at 10:49
  • Yes - I am working on the OOME of course - the current issue is solved. But it's a general principle - I don't want my app running in a useless state due to erroneous error handling, or prolonging the death of the JVM. – Michael Neale Oct 06 '10 at 21:43
0

You could use the MemoryPoolMXBean to be notified when a program exceeds a set heap allocation threshold.

I haven't used it myself but it should be possible to shut down this way when the remaining memory gets low by setting an allocation threshold and calling System.exit() when you receive the notification.
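
An untested-in-production sketch of that approach, using the standard `java.lang.management` API. The class name `LowMemoryGuard`, the `fraction` parameter, and the injectable `exitAction` are assumptions made for this example; in real use the action would be something like `System::exit` or `Runtime.getRuntime()::halt`.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryNotificationInfo;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.function.IntConsumer;
import javax.management.NotificationEmitter;
import javax.management.NotificationListener;

/** Hypothetical helper: runs an exit action when a heap pool crosses a usage threshold. */
final class LowMemoryGuard {
    private LowMemoryGuard() {}

    /**
     * Arms a usage threshold at the given fraction of each heap pool's maximum size.
     * Returns the number of pools that were armed (pools without threshold support
     * or without a defined maximum are skipped).
     */
    static int install(double fraction, IntConsumer exitAction) {
        if (fraction <= 0.0 || fraction > 1.0) {
            throw new IllegalArgumentException("fraction must be in (0, 1]: " + fraction);
        }
        NotificationListener listener = (notification, handback) -> {
            if (MemoryNotificationInfo.MEMORY_THRESHOLD_EXCEEDED.equals(notification.getType())) {
                exitAction.accept(1);
            }
        };
        // The platform MemoryMXBean is documented to be a NotificationEmitter.
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        ((NotificationEmitter) memory).addNotificationListener(listener, null, null);

        int armed = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP || !pool.isUsageThresholdSupported()) {
                continue;
            }
            long max = pool.getUsage().getMax();
            if (max <= 0) {
                continue; // maximum is undefined for this pool
            }
            pool.setUsageThreshold((long) (max * fraction));
            armed++;
        }
        return armed;
    }
}
```

Something like `LowMemoryGuard.install(0.95, System::exit)` early in `main()` would shut down before an OOME is thrown at all. Note that a usage threshold can be crossed while much of the pool is still collectable garbage; `setCollectionUsageThreshold`, which measures usage after a collection, may give fewer false alarms where it is supported.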

josefx
-1

The only thing I can think of is using AOP to wrap every single method (taking care to exclude java.*) in a try-catch for OOME that logs something and calls System.exit() in the catch block.

Not a solution I'd call elegant, though...

biasedbit
  • Sounds painful ;) Also, there's no way to tell the bits of code that have good OOME catchers and those that have bad ones.. – Chris Dennett Oct 06 '10 at 09:55
  • It doesn't add that much overhead. Problem is if the try-catch is in the middle of some obscure method, there's nothing you can do. If, however, the exception occurs at some lower level of the library/app and some higher method catches it, AOP would work here. Like I said, can't think of anything else that works for this case :) – biasedbit Oct 06 '10 at 10:00
  • "... nothing you can do." Well, actually there might be... if you know the exact method where this happens, again AOP to the rescue: configure a pointcut to completely divert the flow of calling that method (most of the times this is impossible though, especially on instance methods that require context or use instance vars). – biasedbit Oct 06 '10 at 10:02
  • This would indeed be painful, but that's mainly because it's not quite the right AOP solution. You can write pointcuts to intercept catch-blocks, meaning your aspect only has to wrap catch(Throwable) and catch(OOME). --- But regardless, I think the ROI for this kind of effort is poor. – RonU Oct 06 '10 at 15:39