I have two services: one is hosted on Google App Engine (GAE) and the other on Cloud Run.
I use urlfetch (Python 2), imported from google.appengine.api, in GAE to call APIs provided by the Cloud Run service.
Occasionally (fewer than 10 times per week) a DeadlineExceededError shows up like this:
Deadline exceeded while waiting for HTTP response from URL
Over the past few days, however, the error has suddenly become frequent (around 40 per day). I am not sure whether this is due to Christmas peak traffic or something else.
I checked the Load Balancer logs for Cloud Run, and it turned out the requests never reached the Load Balancer.
Has anyone encountered a similar issue before? Is something wrong with GAE urlfetch?
I found a similar conversation, but the suggestion there was just to handle the error...
What can I do to mitigate the issue? Many thanks.
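For reference, the "handle the error" suggestion could look roughly like the sketch below: retry the call with exponential backoff when the deadline is exceeded. This is a minimal sketch, not a confirmed fix. The helper name fetch_with_retry, its parameters, the 30-second deadline, and the placeholder URL are all assumptions for illustration; only urlfetch.fetch(url, deadline=...) and urlfetch_errors.DeadlineExceededError come from the App Engine SDK.

```python
import time


def fetch_with_retry(fetch, url, retry_on, attempts=3, base_delay=0.5,
                     sleep=time.sleep):
    """Call fetch(url), retrying on `retry_on` with exponential backoff.

    `fetch` and `retry_on` are injected so the helper itself does not
    depend on the App Engine SDK (hypothetical helper, for illustration).
    """
    for attempt in range(attempts):
        try:
            return fetch(url)
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller see the error
            # Wait 0.5s, 1s, 2s, ... before the next attempt.
            sleep(base_delay * (2 ** attempt))


# Hypothetical usage on GAE Standard (python27):
#   from google.appengine.api import urlfetch, urlfetch_errors
#   result = fetch_with_retry(
#       lambda u: urlfetch.fetch(u, deadline=30),  # assumed longer deadline
#       "https://example.com/api",                 # placeholder URL
#       urlfetch_errors.DeadlineExceededError)
```

Retrying is only safe for idempotent calls, since a request that timed out on the client side may still be processed by the backend later.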
Update 1
I checked again and found that some requests from App Engine did show up in the Cloud Run Load Balancer logs, but the timing is odd:
e.g.
Logs from GAE project
10:36:24.706 send request
10:36:29.648 deadline exceeded
Logs from Cloud Run project
10:36:35.742 reached load balancer
10:36:49.289 finished processing
Not sure why it took so long for the request to reach the Load Balancer...
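To make the gaps explicit, here is a small sketch that computes the intervals between the four log timestamps above. It assumes the clocks and time zones of the two projects' logs are directly comparable; the helper name gap_seconds is made up for this example.

```python
from datetime import datetime


def gap_seconds(start, end, fmt="%H:%M:%S.%f"):
    """Return the number of seconds between two same-day log timestamps."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds()


sent = "10:36:24.706"               # GAE: send request
deadline_exceeded = "10:36:29.648"  # GAE: deadline exceeded
reached_lb = "10:36:35.742"         # Cloud Run: reached load balancer
finished = "10:36:49.289"           # Cloud Run: finished processing

print(gap_seconds(sent, deadline_exceeded))  # time until urlfetch gave up
print(gap_seconds(sent, reached_lb))         # time until the LB logged it
print(gap_seconds(reached_lb, finished))     # processing time behind the LB
```

So urlfetch gave up after about 4.9 s, the Load Balancer only logged the request about 11.0 s after it was sent, and processing then took a further 13.5 s or so.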
Update 2
I am using GAE Standard, located in the US, with the following settings:
runtime: python27
api_version: 1
threadsafe: true
automatic_scaling:
  max_pending_latency: 5s
inbound_services:
- warmup
- channel_presence
builtins:
- appstats: on
- remote_api: on
- deferred: on
...
The Cloud Run-hosted API gateway I am calling is located in Asia. In front of it is a Google Load Balancer of type HTTP(S) (classic).
Update 3
I wrote a simple script that periodically calls the Cloud Run endpoint directly using axios (with the timeout set to 5s). After a while, some requests timed out. Checking the logs in my Cloud Run project, I found two different patterns:
For request A, much like what I described in Update 1, logs were found for both the Load Balancer and the Cloud Run revision, with
(time of CR revision log) - (time of LB log) > 5s
so I think this is a legitimate timeout.
But for request B, no logs were found at all.
So I guess the problem is with neither urlfetch nor GAE?
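For anyone who wants to reproduce the test from Update 3 without GAE, a rough Python equivalent of a single probe is sketched below (the original script used axios). It is a minimal sketch: the function name probe and its opener and clock parameters are assumptions added so the logic can be exercised without a network, and the 5-second timeout mirrors the axios setting.

```python
import socket
import time

try:
    from urllib.request import urlopen   # Python 3
    from urllib.error import URLError
except ImportError:
    from urllib2 import urlopen, URLError  # Python 2


def probe(url, timeout=5.0, opener=urlopen, clock=time.time):
    """Send one request and return (outcome, elapsed_seconds)."""
    start = clock()
    try:
        response = opener(url, timeout=timeout)
        outcome = "status %d" % response.getcode()
    except socket.timeout:
        # Timed out while waiting for the response.
        outcome = "timeout"
    except URLError as exc:
        # A timeout while connecting is wrapped in URLError.
        reason = getattr(exc, "reason", None)
        if isinstance(reason, socket.timeout):
            outcome = "timeout"
        else:
            outcome = "error: %s" % reason
    return outcome, clock() - start
```

Calling probe in a loop with a sleep between calls, and printing the wall-clock time next to each outcome, gives client-side timestamps that can be lined up against the Load Balancer and Cloud Run logs.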