How to fix CloudRun error 'The request was aborted because there was no available instance'

Question

I'm using managed Cloud Run to deploy a container with concurrency=1. Once deployed, I fire four long-running requests in parallel. Most of the time everything works fine, but occasionally one of the requests fails with a 500 within a few seconds; the logs only show the error message quoted in the title.

Using retry with exponential back-off did not improve the situation; the retries also end up with 500s. StackDriver logs also do not provide further information.

Potentially relevant gcloud beta run deploy arguments:

--memory 2Gi --concurrency 1 --timeout 8m --platform managed

What does the error message mean exactly -- and how can I solve the issue?

Yes, us-central1 -- as it's still the only choice (for me?) when trying to create a new service through console.cloud.google.com / UI; CLI offered more choices long ago, but it always resulted in errors for me, making me believe it's really only available there? — Jan Hacker, Jul 12 '19 at 13:50
The UI only offers central, but the CLI lets you use others as well. We tried east with success (but it doesn't show up in the UI) — Pentium10, Jul 12 '19 at 13:54
Many new regions are now available: https://cloud.google.com/run/docs/release-notes#july_10_2019 — guillaume blaquiere, Jul 15 '19 at 20:40

score 18 · Accepted Answer · answered Jul 12 '19 at 20:51

This error message can appear when the infrastructure didn't scale fast enough to catch up with the traffic spike. Infrastructure only keeps a request in the queue for a certain amount of time (about 10s) then aborts it.

This usually happens when:

traffic suddenly increases sharply
cold start time is long
request time is long
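To see how these three factors interact, here is a minimal sketch of a toy queueing model in Python. It is not how Cloud Run's autoscaler actually works; it only assumes that each instance handles one request at a time, that a new instance takes `cold_start_s` seconds to become ready, and that a request waiting longer than `queue_timeout_s` (the "about 10s" mentioned above) is aborted. The function and parameter names are hypothetical.

```python
import heapq


def simulate(arrivals, request_s, cold_start_s, max_instances,
             warm_instances=0, queue_timeout_s=10.0):
    """Count aborted requests in a toy model with concurrency=1 per instance."""
    # Min-heap of the times at which each existing instance becomes free.
    free_at = [0.0] * warm_instances
    heapq.heapify(free_at)
    aborted = 0
    for t in sorted(arrivals):
        # Nothing idle right now and room to scale: start a new instance,
        # which only becomes usable after its cold start.
        if (not free_at or free_at[0] > t) and len(free_at) < max_instances:
            heapq.heappush(free_at, t + cold_start_s)
        if not free_at:
            aborted += 1  # no instance exists and none may be created
            continue
        earliest = heapq.heappop(free_at)
        start = max(t, earliest)
        if start - t > queue_timeout_s:
            # Waited too long in the queue: the request is dropped and the
            # instance's schedule is left unchanged.
            aborted += 1
            heapq.heappush(free_at, earliest)
        else:
            heapq.heappush(free_at, start + request_s)
    return aborted
```

With four simultaneous 60-second requests, `simulate([0, 0, 0, 0], 60, 2, 10)` aborts nothing, while capping instances at 2 (`simulate([0, 0, 0, 0], 60, 2, 2)`) or raising the cold start to 15 seconds (`simulate([0, 0, 0, 0], 60, 15, 10)`) produces aborted requests in this model.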

Steren

Can you improve this answer by explaining how to fix the error, not just why it happens? – Pentium10 Jul 15 '19 at 07:21
There are performance tips in the [docs](https://cloud.google.com/run/docs/tips#optimizing_performance) that could help with this – Corinne White Jul 16 '19 at 15:45
Accepting the answer as helpful despite the room for improvement. Haven't seen the error for a few days now... Should it reoccur, I'll try to add preliminary warm-up requests. IMHO, request time being long should not lead to this error (given I have specified a relatively long timeout). – Jan Hacker Aug 01 '19 at 06:23
This is only a half-answer: it explains "what does this error mean" but not "how can I solve the issue". [Corinne White](https://stackoverflow.com/users/11293217/corinne-white) links to docs which is helpful, but they're pretty generic. – orbiteleven Nov 23 '22 at 08:45

score 8 · Answer 2 · answered Apr 12 '21 at 07:14

We also faced this issue when traffic suddenly increased during business hours. It is usually caused by a sudden increase in traffic combined with instances taking too long to start to absorb the incoming requests. One way to handle this is to keep warm instances always running, i.e. to set the --min-instances parameter in the gcloud run deploy command. Another, recommended, way is to reduce the service's cold start time (which is difficult to achieve in some languages, such as Java and Python).
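As a rough aid for picking a value for `--min-instances`, the sketch below applies Little's law (average requests in flight = arrival rate × average request duration) and divides by the per-instance concurrency. The function name and the example numbers are hypothetical, and the result is only a starting estimate for steady traffic, not a guarantee against the error.

```python
import math


def estimate_instances(requests_per_s, avg_request_s, concurrency=1):
    """Estimate how many instances steady traffic keeps busy (Little's law)."""
    in_flight = requests_per_s * avg_request_s  # average concurrent requests
    return math.ceil(in_flight / concurrency)
```

For example, `estimate_instances(2, 6)` gives 12 for two requests per second that each take six seconds with concurrency 1. Bursts above the average still depend on cold start time.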

score 6 · Answer 3 · answered Jul 14 '19 at 12:48

I also experienced the problem, and it is easy to reproduce. I have a Fibonacci container that computes fibo(45) in 6s. I used hey to send 200 requests, with Cloud Run concurrency set to 1.

Out of 200 requests, I got 8 of these errors. In my case the causes were a sudden traffic spike and a long processing time (the cold start is short for me, since the service is written in Go).
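For anyone who wants to reproduce this without hey, here is a minimal Python sketch of a similar burst test using only the standard library. The helper names are hypothetical, the worker count of 50 is an assumption rather than the setting used above, and the service URL is a placeholder you would replace with your own.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
import urllib.error
import urllib.request


def fetch_status(url, timeout_s=60):
    """Return the HTTP status code of a single GET request."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # urllib raises HTTPError for 4xx/5xx responses


def burst(send, total=200, workers=50):
    """Call send() `total` times from `workers` threads and tally the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return Counter(pool.map(lambda _: send(), range(total)))
```

Usage would look like `burst(lambda: fetch_status("https://YOUR-SERVICE-URL/"))`; the returned counter maps each status code to how often it occurred, so any 500s show up directly. Connection errors and timeouts are not caught in this sketch and would propagate as exceptions.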

score 1 · Answer 4 · answered Nov 19 '19 at 03:35

I was able to resolve this on my service by raising the maximum autoscaling container count from 2 to 10. There is really no reason why 2 should be even close to too low for the traffic, but I suspect something in the Cloud Run internals was tying up as many as 2 containers somehow.

Charles Offenbacher

Where can you see "max autoscaling"? I cannot find any documentation on it. – anonymous-dev Nov 09 '21 at 23:00
In YAML, use `autoscaling.knative.dev/maxScale: '4'`, I couldn't find GUI knobs and suspect YAML is the design. For me, Cloud Run can spike far over its 4 max to 12! I think because my site is new/unused it scales to zero on idle, so when GoogleBot drives past it scrambles to scale up, and being .NET it takes a while to start, it overshoots. I suspect a smaller VM with `autoscaling.knative.dev/minScale: '1'` might prevent this, but I'm not sure what's cheaper, brief overscaling or always on. – Luke Puplett Dec 23 '21 at 10:50
See https://cloud.google.com/run/docs/configuring/max-instances and the sibling doc pages for all this. – Luke Puplett Dec 23 '21 at 10:53

score 0 · Answer 5 · answered Dec 18 '22 at 18:14

Setting the Max Retry Attempts to anything but zero will remedy this, as it did for me.
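The answer does not say which product's "Max Retry Attempts" setting is meant. As a client-side equivalent, here is a minimal sketch of retrying on 5xx responses with exponential backoff and jitter. The function and parameter names are hypothetical, and, as the question notes, retries may not help if the retried request hits the same scaling bottleneck.

```python
import random
import time


def call_with_retries(send, max_attempts=4, base_delay_s=1.0, sleep=time.sleep):
    """Call send() until it returns a non-5xx status or attempts run out."""
    status = None
    for attempt in range(max_attempts):
        status = send()
        if status < 500:
            return status
        if attempt < max_attempts - 1:
            # Exponential backoff with jitter: roughly 1s, 2s, 4s, ...
            sleep(base_delay_s * 2 ** attempt * random.uniform(0.5, 1.5))
    return status
```

Here `send` is any zero-argument callable that performs the request and returns its HTTP status code; the `sleep` parameter exists only so the delay can be replaced in tests.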

Learn2Code

score 0 · Answer 6 · answered Aug 25 '23 at 19:17

This error can be caused by one of the following:

A huge sudden increase in traffic.
A long cold start time.
A long request processing time.
A sudden increase in request processing time.
The service reaching its maximum container instance limit (HTTP 429).

We faced a similar issue sporadically; it was caused by long request processing times when DB latency was high for a few requests.
