
What tools or best practices are available for gracefully degrading a Java service during bursts of memory-intensive requests? The application in question is multi-threaded. The amount of work required to handle each request can vary greatly, and it would not be easy to split up and parallelize.

I’m wary of writing application-level code that concerns itself with heap usage and GC, but we find the application can get itself into trouble, meaning out-of-memory errors or full GCs, by taking on more than one intensive request at a time. Often a full GC is not able to find any memory to free.

Long story short: I am thinking of adding some throttling or queuing capabilities to pre-empt this kind of problem.

Any ideas or advice appreciated.

ChrisW
  • I assume you have profiled your application to ensure you cannot reduce the amount of memory used. Also, have you checked whether you can get a bigger server? A 16 GB PC can cost you $1000. – Peter Lawrey Jun 22 '11 at 08:06
  • Peter - Yes, we have profiled the application to see where we can reduce memory usage. More physical memory, bigger server - might these options not cause the GCs to be longer and more painful when they do happen? – ChrisW Jun 22 '11 at 14:01
  • The cost of GC is proportional to the amount used and the inverse of the amount free. If you use the same amount of memory the GC pause will be the same, but far less often. If you use more memory it will be longer, but that is better than failing outright. – Peter Lawrey Jun 22 '11 at 14:06

5 Answers


As joeslice said, implement throttling via a simple resource pool. At the most basic level, this is a semaphore -- your worker threads need to acquire a permit before they process requests. Since you say you have heterogeneous tasks, you probably want the permits to be a little more complex, e.g. acquire some number of permits proportional to the size of the work.
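A minimal sketch of the weighted-permit idea, assuming a total budget of 100 units and per-job costs supplied by some sizing heuristic of your own (all the numbers here are illustrative):

    import java.util.concurrent.Semaphore;

    // A weighted throttle: memory-intensive jobs must acquire more permits,
    // so only a bounded amount of "work weight" runs concurrently.
    public class WeightedThrottle {
        private final Semaphore units = new Semaphore(100); // total budget (illustrative)

        public void run(int costInUnits, Runnable task) throws InterruptedException {
            units.acquire(costInUnits); // blocks until enough budget is free
            try {
                task.run();
            } finally {
                units.release(costInUnits); // return the budget even if the task throws
            }
        }

        public static void main(String[] args) throws InterruptedException {
            WeightedThrottle throttle = new WeightedThrottle();
            throttle.run(1, () -> System.out.println("typical job"));       // 1 unit
            throttle.run(20, () -> System.out.println("memory-heavy job")); // 20 units
        }
    }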

In the past, I've found that this doesn't always work. Let's say your heuristics are off and your app throws an OOM anyway. It's important to prevent the process from hanging around in a bad state, so kill and restart the process immediately. There are a few ways to notice when an OOM happens, e.g. see java out of memory then exit.
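For example, the HotSpot JVM can run an external command when an OOM occurs, so the process can be killed and restarted by a supervisor. The flags below are HotSpot options; myapp.jar and the supervision setup are placeholders:

    # Keep a heap dump for post-mortem analysis, and kill the process on
    # OOM so a supervisor or wrapper script can restart it cleanly.
    java -XX:+HeapDumpOnOutOfMemoryError \
         -XX:OnOutOfMemoryError="kill -9 %p" \
         -jar myapp.jar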

jtoberon
  • The process will not necessarily hang in a bad state. An OOM unwinds the stack of the offending thread, releasing some memory. At the thread pool level the OOM can be caught and the heuristics adjusted somewhat to take the changed request pattern into account. The app as a whole can survive and continue to serve. – Vladimir Dyuzhev Jun 22 '11 at 02:41
  • OOM can have various causes, only one of which is controlled by the semaphore/heuristics, so I don't think what you've described is reliable. In other words, catching an OOM may work sometimes, but I've found it much better to be safe than sorry. Keep GC and application logs so that you can reproduce the problem later, outside of your production environment! – jtoberon Jun 22 '11 at 02:52

Here is an implementation example by the authors of Netty (link). They basically keep track of the memory usage and throttle directly based on that statistic.
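The class behind that example is presumably Netty 3's MemoryAwareThreadPoolExecutor; a sketch of how it is constructed (the pool size and byte limits are illustrative; check the Netty javadoc for the details):

    import java.util.concurrent.Executor;
    import org.jboss.netty.handler.execution.MemoryAwareThreadPoolExecutor;

    // Throttles task submission once the estimated memory held by queued
    // tasks exceeds the given limits (sizes are in bytes).
    Executor executor = new MemoryAwareThreadPoolExecutor(
            16,          // core pool size
            1048576,     // max queued bytes per channel (1 MiB)
            67108864);   // max queued bytes in total (64 MiB)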

Another, cruder way of doing this is to limit concurrent execution by using a fixed thread pool and a bounded queue. A common approach is to have the submitting thread execute the task itself once the queue is full (the "caller runs" policy). This way the load will (well, is supposed to) propagate all the way back to the client until the creation of new requests becomes slower, and the behavior of the app becomes more "graceful".

In practice, I use almost exclusively the "crude" way described above. It works pretty well: basically a combination of a fixed thread pool and a bounded queue, plus the caller-runs rejection policy. I keep the parameters (queue size, thread pool size) configurable, and once the design is done I tune them. Sometimes it becomes apparent that a thread pool can be shared among services, etc., in which case it is really handy that the class ThreadPoolExecutor gives you fixed thread pool, bounded queue, and caller-runs policy all wrapped in one.
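A sketch of that combination using only java.util.concurrent; the pool and queue sizes are the tunables mentioned above, and the values here are illustrative:

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;

    // Fixed pool + bounded queue + caller-runs policy in one executor.
    ThreadPoolExecutor pool = new ThreadPoolExecutor(
            8, 8,                                  // fixed size: core == max
            0L, TimeUnit.MILLISECONDS,             // idle timeout is moot for a fixed pool
            new ArrayBlockingQueue<Runnable>(100), // bounded work queue
            new ThreadPoolExecutor.CallerRunsPolicy()); // queue full => submitter runs the task

    // Once the pool and queue are saturated, execute() runs the task on the
    // submitting thread, which naturally slows down request intake.
    pool.execute(() -> { /* handle one request */ });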

Enno Shioji

I wonder if there is a way to predetermine approximately how much memory you will use for a given job. If there were some way to determine that a particular input is likely to yield an explosive memory footprint, perhaps you could keep it from being run at the same moment as another high-usage job.

If you could determine the relative size from job to job (that's a big assumption), you could allow (say) 100 units of work to be run at once using a counting Semaphore. A typical job might count as only one unit (and acquire just one permit), whereas a larger job may need to acquire 10 or 20 permits before running.

Of course, if you can't predetermine anything about the size of the to-be-consumed memory, you might still be able to explore ways to further subdivide your problem so that you are doing a larger number of small-memory jobs instead of a small number of big jobs.

joeslice

In application servers there is usually a setting for the worker thread pool. The maximum number of threads in that pool roughly bounds how much memory you will consume. This is a simple and, importantly, working concept.
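For example, in Tomcat this is the maxThreads attribute on the connector; the values below are illustrative, not recommendations:

    <!-- server.xml: cap the number of concurrent worker threads; further
         connections wait in the accept queue instead of consuming heap. -->
    <Connector port="8080" protocol="HTTP/1.1"
               maxThreads="50" acceptCount="100" />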

I would not call it "graceful degradation" though; this is throttling. Graceful degradation involves reducing the service level (e.g. the amount of detail provided to the user) to keep at least the basic necessary functions available to every current user. With throttling, extra users are just out of luck.

Graceful degradation by that definition requires knowledge of the application's nature, and therefore you have to make the code aware of it.

The obvious approach is to divide all possible operations into classes by their necessity to the user. The 1st class shall always be handled; the 2nd (3rd, 4th, ...) class shall be served only if the server is below a specific load level, otherwise a "temporarily unavailable" error is returned.
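A sketch of that idea, where the load measure (in-flight requests) and the threshold are illustrative assumptions:

    import java.util.concurrent.atomic.AtomicInteger;

    // Serves essential operations unconditionally and sheds optional ones
    // once the server is above a (hypothetical) load threshold.
    public class LoadShedder {
        public enum Priority { ESSENTIAL, OPTIONAL }

        private static final int BUSY_THRESHOLD = 50; // illustrative load level
        private final AtomicInteger inFlight = new AtomicInteger();

        public void handle(Priority priority, Runnable work) {
            if (priority == Priority.OPTIONAL && inFlight.get() >= BUSY_THRESHOLD) {
                throw new IllegalStateException("temporarily unavailable"); // map to e.g. HTTP 503
            }
            inFlight.incrementAndGet();
            try {
                work.run();
            } finally {
                inFlight.decrementAndGet();
            }
        }
    }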

Vladimir Dyuzhev

Are you using J2EE? Load balancing like this is the application server's job, and I'm sure many mainstream app servers support it. Your application should not have to concern itself with it.

Denis Tulskiy