- Partial answer -
Note that I am using the OpenJDK (JDK 9) sources as a foundation to comment on this question. This answer does not rely on any kind of documentation or published specification, and includes a bit of speculation coming from my understanding and interpretation of the source code.
The GC overhead limit exceeded error is considered by the VM to be a subtype of out-of-memory error, and it is generated after an attempt to allocate memory fails (see (a) below).
Essentially, the VM keeps track of the proportion of time spent in full garbage collections and compares it against the limit enforced for full GCs (which can be configured on HotSpot using -XX:GCTimeLimit=, cf. Garbage Collector Ergonomics).
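To make this concrete, here is a minimal Java sketch of my own (the class name, heap size flag and loop body are illustrative assumptions, not taken from the question) that typically drives the time spent in full collections high enough to trigger the error when run with a small heap; whether you actually get GC overhead limit exceeded rather than Java heap space depends on the collector and on timing:

// Run with something like: java -Xmx64m GcOverheadDemo
// (the -Xmx value is an illustrative assumption, not a recommendation)
import java.util.HashMap;
import java.util.Map;

public class GcOverheadDemo {
    public static void main(String[] args) {
        Map<Integer, String> retained = new HashMap<>();
        int i = 0;
        while (true) {
            // Every entry stays reachable, so full collections free almost
            // nothing and the fraction of time spent in GC keeps climbing.
            retained.put(i, "value-" + i);
            i++;
        }
    }
}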
The tracking of this GC cost and of the overhead limit count, together with the logic that decides when the GC overhead limit has been exceeded, are all implemented in one place, in hotspot/src/share/vm/gc/shared/adaptiveSizePolicy.cpp. As you can see, two additional conditions on the memory left free in the old and eden generations must also hold for the GC overhead limit to be considered exceeded:
void AdaptiveSizePolicy::check_gc_overhead_limit(
      size_t young_live,
      size_t eden_live,
      size_t max_old_gen_size,
      size_t max_eden_size,
      bool   is_full_gc,
      GCCause::Cause gc_cause,
      CollectorPolicy* collector_policy) {
  ...
  if (is_full_gc) {
    if (gc_cost() > gc_cost_limit &&
        free_in_old_gen < (size_t) mem_free_old_limit &&
        free_in_eden < (size_t) mem_free_eden_limit) {
      // Collections, on average, are taking too much time, and
      //      gc_cost() > gc_cost_limit
      // we have too little space available after a full gc.
      //      total_free_limit < mem_free_limit
      // where
      //   total_free_limit is the free space available in
      //     both generations
      //   total_mem is the total space available for allocation
      //     in both generations (survivor spaces are not included
      //     just as they are not included in eden_limit).
      //   mem_free_limit is a fraction of total_mem judged to be an
      //     acceptable amount that is still unused.
      // The heap can ask for the value of this variable when deciding
      // whether to thrown an OutOfMemory error.
      // Note that the gc time limit test only works for the collections
      // of the young gen + tenured gen and not for collections of the
      // permanent gen.  That is because the calculation of the space
      // freed by the collection is the free space in the young gen +
      // tenured gen.
      // At this point the GC overhead limit is being exceeded.
      inc_gc_overhead_limit_count();
      if (UseGCOverheadLimit) {
        if (gc_overhead_limit_count() >= AdaptiveSizePolicyGCTimeLimitThreshold) {
          // All conditions have been met for throwing an out-of-memory
          set_gc_overhead_limit_exceeded(true);
          // Avoid consecutive OOM due to the gc time limit by resetting
          // the counter.
          reset_gc_overhead_limit_count();
        } else {
          ...
        }
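Paraphrasing the check above in Java, this is a sketch of my reading of the code, not the actual HotSpot implementation; the default thresholds correspond, as far as I can tell, to -XX:GCTimeLimit=98, -XX:GCHeapFreeLimit=2 and AdaptiveSizePolicyGCTimeLimitThreshold=5 (double-check these for your build):

// Sketch only: mirrors the structure of check_gc_overhead_limit for full GCs.
class GcOverheadCheckSketch {
    static final double GC_COST_LIMIT = 0.98;  // GCTimeLimit, fraction of time in GC
    static final double MEM_FREE_LIMIT = 0.02; // GCHeapFreeLimit, fraction still free
    static final int COUNT_THRESHOLD = 5;      // consecutive violations required

    private int overheadLimitCount = 0;

    // Called after a full GC with the measured GC cost and the fractions of
    // the old and eden generations that are still free.
    boolean limitExceededAfterFullGc(double gcCost,
                                     double freeOldFraction,
                                     double freeEdenFraction) {
        if (gcCost > GC_COST_LIMIT
                && freeOldFraction < MEM_FREE_LIMIT
                && freeEdenFraction < MEM_FREE_LIMIT) {
            overheadLimitCount++;
            if (overheadLimitCount >= COUNT_THRESHOLD) {
                overheadLimitCount = 0; // avoid back-to-back errors
                return true;            // the next failed allocation reports the error
            }
        } else {
            overheadLimitCount = 0;     // conditions not met, start counting again
        }
        return false;
    }
}

In other words, the error is only armed after several consecutive full collections that were both expensive time-wise and nearly useless space-wise.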
(a) When is a GC overhead limit exceeded error generated?
It does not actually happen during a collection itself, but when the VM makes an attempt to allocate memory; the justification for this statement can be found in hotspot/src/share/vm/gc/shared/collectedHeap.inline.hpp:
HeapWord* CollectedHeap::common_mem_allocate_noinit(KlassHandle klass, size_t size, TRAPS) {
  ...
  bool gc_overhead_limit_was_exceeded = false;
  result = Universe::heap()->mem_allocate(size, &gc_overhead_limit_was_exceeded);
  ...
  // Failure cases
  if (!gc_overhead_limit_was_exceeded) {
    report_java_out_of_memory("Java heap space");
    ...
  } else {
    report_java_out_of_memory("GC overhead limit exceeded");
    ...
  }
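From the Java side, the consequence is that the error surfaces at an allocation site, carrying one of the two messages passed to report_java_out_of_memory above. A small illustration (the class name is hypothetical, and catching OutOfMemoryError like this is for demonstration purposes only):

import java.util.ArrayList;
import java.util.List;

public class OomMessageDemo {
    public static void main(String[] args) {
        List<int[]> retained = new ArrayList<>();
        try {
            while (true) {
                retained.add(new int[1024]); // the error is thrown at an allocation like this one
            }
        } catch (OutOfMemoryError e) {
            retained = null; // drop the retained data so the JVM can recover enough to print
            // The message is either "Java heap space" or "GC overhead limit exceeded".
            System.err.println("OutOfMemoryError message: " + e.getMessage());
        }
    }
}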
(b) Note about the G1 implementation
Looking at the mem_allocate method of the G1 implementation (which can be found in g1CollectedHeap.cpp), it appears that the boolean gc_overhead_limit_was_exceeded is not used anymore. I would not be too quick to conclude that the GC overhead limit error can no longer occur when G1 is enabled, though; I still need to check this.
Conclusion
It seems you were right in that this error genuinely comes from memory exhaustion.
The argument that this error can be generated based on the number of times small objects are collected does not seem right to me, because:
- We saw that the VM does need to run out of memory (an allocation attempt must fail) for this error to occur;
- Independently of the first reason, the statement would need to be refined further anyway, especially the reference to small objects. Are we talking about young generation collections only? If so, these collections are not included in the GC cost that is checked against the limit, and therefore would never have a chance to be involved in this error, whether the VM runs out of memory or not.