On Google App Engine running Java 8, we are seeing frontend requests to servlets fail with:

java.lang.OutOfMemoryError: GC overhead limit exceeded

But the stack trace for that error is unrelated to the servlet. Instead, it originates in the run() method of a background deferred task.

Those tasks are usually executed by the deferred task servlet, and certainly not by the servlet where we're seeing the 500 error.

I'm scratching my head on this one. Suppose a particular instance was serving both a deferred task and that frontend servlet request, and it crashed with the GC error above. Could that take out the servlet thread (and presumably other threads on the same instance), so that they all fail with the same memory exception?

So my question is: does the thread isolation on an instance in the Google App Engine Java 8 environment allow a memory error in one thread to crash all threads in that same instance, and if so, would the stack trace in all threads be the same?

Update: Looking at the logs for the specific instance ID where this happened, around the same time, I see a massive number of other examples of the same stack trace, all for different users and different frontend servlets. This seems to support the theory that the whole instance was broken and that the stack trace we're seeing is somehow crossing over from a different thread on the same instance.

Ashley Schroder
In short: heap memory is defined to be shared memory. If the heap memory is exhausted, it’s exhausted for all threads. There is no “guilty thread”. Once there is not enough memory, every thread attempting an allocation may fail with an individual `OutOfMemoryError`, unless there’s not even enough memory to construct the `OutOfMemoryError` instances. See [Where is the OutOfMemoryError object created in Java](https://stackoverflow.com/q/46293129/2711488). But it’s more important to understand that it is not “a memory error in one thread” that makes the others fail; it’s the absence of free memory. – Holger Apr 30 '19 at 07:36
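To illustrate the point that the error is raised in whichever thread asks for memory, here is a minimal sketch. It does not really exhaust the heap; it assumes a typical JVM where a request for an array of `Integer.MAX_VALUE` longs is rejected with `OutOfMemoryError` straight away. The class and method names (`OomPerThreadDemo`, `allocationFails`, `runOnThreads`) are made up for this example and are not part of App Engine.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class OomPerThreadDemo {

    // Hypothetical stand-in for work done by a servlet or a deferred task.
    // Returns true if the allocation in the calling thread failed with OutOfMemoryError.
    static boolean allocationFails() {
        try {
            // Assumption: this oversized request is refused immediately on typical JVMs,
            // so the demo never actually fills the heap.
            long[] huge = new long[Integer.MAX_VALUE];
            return huge.length < 0; // only reached if the allocation succeeded
        } catch (OutOfMemoryError e) {
            // The error is delivered to the thread that attempted the allocation.
            return true;
        }
    }

    // Submits the same allocation attempt once per pool thread and collects the outcomes.
    static List<Boolean> runOnThreads(int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Callable<Boolean> task = OomPerThreadDemo::allocationFails;
            List<Future<Boolean>> futures = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                futures.add(pool.submit(task));
            }
            List<Boolean> results = new ArrayList<>();
            for (Future<Boolean> f : futures) {
                results.add(f.get());
            }
            return results;
        } catch (Exception e) {
            throw new IllegalStateException("worker failed unexpectedly", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // One entry per task; true means that task's allocation failed.
        System.out.println(runOnThreads(2));
    }
}
```

In a real instance the same thing happens with ordinary allocations once the shared heap is full: every thread that tries to allocate, whatever request it is serving, can receive its own error.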

1 Answer

Something similar happened to me on App Engine with Python.

I was going back and forth with GCP support, and apparently the culprit was a memory leak in one of their libraries. It would knock out my whole instance, and every request running at that time died together.

So yes, if the instance crashes, every request running on it dies with it.
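For the Java case in the question, one way to see whether an instance is drifting toward a full heap is to log heap usage from request handlers. This is only a sketch using the standard `Runtime` API; the `HeapUsage` class and the idea of calling it at the start of each request are assumptions for illustration, not something App Engine provides.

```java
class HeapUsage {

    // Fraction of the maximum heap currently in use, as reported by the JVM.
    static double usedFraction() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        return (double) used / rt.maxMemory();
    }

    // Short human-readable summary suitable for a log line.
    static String describe() {
        Runtime rt = Runtime.getRuntime();
        long usedMb = (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
        long maxMb = rt.maxMemory() / (1024 * 1024);
        return "heap used " + usedMb + " MB of " + maxMb + " MB";
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If the logged figure climbs steadily across requests on the same instance ID, that would point to a leak on the instance rather than to whichever servlet happened to hit the error.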

Alex