We have a REST API written in Spring Boot (Java 8) and hosted on the App Engine flexible environment.
Currently, app.yaml looks like this:
runtime: java
api_version: '1.0'
env: flex
threadsafe: true
manual_scaling:
  instances: 1
resources:
  cpu: 4
  memory_gb: 16
liveness_check:
  path: "/healthcheck"
  check_interval_sec: 30
  timeout_sec: 4
  failure_threshold: 2
  success_threshold: 2
  initial_delay_sec: 300
We have noticed that long-running (1-5 min) requests to one of the endpoints usually return a 499 HTTP response code, which is not something we expected.
It looked something like this:
POST /endpoint -> (work starts in application) -> 499 response is sent back -> (work still runs in application thread) -> caller repeats request -> we have 2 identical requests running -> repeats
To avoid that, we moved the endpoint's work into a background ThreadPoolTaskExecutor, and that endpoint no longer causes any problems.
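For illustration, here is a minimal sketch of that kind of hand-off. It is not our actual code: it uses a plain java.util.concurrent thread pool as a stand-in for Spring's ThreadPoolTaskExecutor, and the class name BackgroundJobs, the method names, and the job key are hypothetical. It also assumes that a repeated request can be recognized by a key, so that a retry reuses the job that is already running instead of starting a second one.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper: hands long-running work to a background pool so the
// request thread can respond right away instead of holding the connection open.
class BackgroundJobs {
    private final ExecutorService pool;
    // In-flight jobs by key, so a retried request reuses the job that is already running.
    private final ConcurrentMap<String, Future<String>> inFlight = new ConcurrentHashMap<>();

    BackgroundJobs(int threads) {
        this.pool = Executors.newFixedThreadPool(threads);
    }

    // Starts the work unless a job with the same key is still running;
    // in that case the existing Future is returned and nothing new is started.
    Future<String> submit(String key, Callable<String> work) {
        Callable<String> wrapped = () -> {
            try {
                return work.call();
            } finally {
                // Forget the job once it finishes so later requests can run it again.
                inFlight.remove(key);
            }
        };
        return inFlight.computeIfAbsent(key, k -> pool.submit(wrapped));
    }

    void shutdown() {
        pool.shutdown();
    }

    public static void main(String[] args) throws Exception {
        BackgroundJobs jobs = new BackgroundJobs(2);
        // "report-42" is a made-up key identifying one logical request.
        Future<String> result = jobs.submit("report-42", () -> "done");
        System.out.println(result.get());
        jobs.shutdown();
    }
}
```

In a controller, the handler would call submit(...) and return immediately (for example with 202 Accepted), and the caller would fetch the result later.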
But another issue appeared: a different endpoint, /second, now gets the same early 499 response, even though it never did before (the usual run time for that endpoint ranges from 1 second when cached up to 3 minutes when not).
This makes me believe that the App Engine scheduler or load balancer is somehow deciding how long requests can take before they get aborted. Are we missing some kind of timeout configuration? Can anyone identify where the problem could be (Tomcat, GCE, Spring, ...)?