7

I have created a load balancer in Google Cloud and attached 2 virtual machines to it. When I send requests to the load balancer, some are passed to the attached virtual machines, but others fail with the following error, even though the health check shows 100% OK at that time.

Error: Server Error The server encountered a temporary error and could not complete your request.
Please try again in 30 seconds.

A_Sk
QASIM JAVED
  • I'd look in the Stackdriver logs and see if there is more information about the nature of the error. – Kolban Dec 24 '19 at 16:35
  • How rapidly are you making requests to the load balancer? Can your backends (VMs) handle the number of requests? Is this error page coming from the load balancer or your backends? – John Hanley Dec 24 '19 at 19:08
  • @JohnHanley thanks for your reply. Here are the answers to your questions: I am sending 1 request per second. Yes, the backends (VMs) have large specs (i.e. 16 GB RAM), and I am testing the load balancer with a simple GET call that returns a static response. This error page is coming from the load balancer. – QASIM JAVED Dec 26 '19 at 04:46
  • Based upon the information provided, I do not know. The 500 Server Error means that the load balancer is crashing (has a fatal error processing your request). What is the request that you are making? – John Hanley Dec 26 '19 at 14:43
  • This could be caused by a myriad of factors. As the information provided is too generic, I would suggest referring to the [troubleshooting section](https://cloud.google.com/load-balancing/docs/https/troubleshooting-ext-https-lbs) of the GCP documentation as a good starting point, assuming this is an HTTP(S) LB. The exact logging error would be useful to diagnose this. – Ajordat Aug 01 '21 at 08:28
  • Since this question has been open for 2 years now, did you find a way to solve your issue? – Ismael Clemente Aguirre Jan 18 '22 at 16:48

2 Answers

0

This answer was written to support the community, based on the limited information provided by the OP and the comments above.

The most reliable way to determine the root cause of an HTTP load balancer issue is to review the log entries.

According to the official Google documentation, HTTP(S) Load Balancing log entries contain information useful for monitoring and debugging your HTTP(S) traffic.

Log entries contain the following types of information:

  • General information, such as severity, project ID, project number, and timestamp.
  • HttpRequest log fields. However, HttpRequest.protocol is not populated for HTTP(S) Load Balancing Cloud Logging logs.
  • A statusDetails field inside the jsonPayload (named structPayload in older documentation). This field holds a string that explains why the load balancer returned the HTTP status that it did. The documentation's tables contain further explanations of these log strings. The statusDetails field is not available for regional external HTTP(S) load balancers.
  • Redirects (HTTP response status code 302 Found) issued from the load balancer are not logged. Redirects issued from the backend instances are logged.
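Once logging is enabled, tallying the statusDetails values of the failing requests usually narrows things down quickly. The following is only a minimal sketch: it assumes the entries were exported as one JSON object per line, with the status under `httpRequest.status` and the explanation under `jsonPayload.statusDetails`. The sample lines and the function name `count_error_details` are hypothetical and only illustrate the shape of the data.

```python
import json
from collections import Counter

# Hypothetical sample of exported log entries, one JSON object per line.
# Real entries carry many more fields; only the two used below are shown.
SAMPLE_LOG_LINES = [
    '{"httpRequest": {"status": 200}, "jsonPayload": {"statusDetails": "response_sent_by_backend"}}',
    '{"httpRequest": {"status": 502}, "jsonPayload": {"statusDetails": "failed_to_pick_backend"}}',
    '{"httpRequest": {"status": 502}, "jsonPayload": {"statusDetails": "backend_connection_closed_before_data_sent_to_client"}}',
    '{"httpRequest": {"status": 502}, "jsonPayload": {"statusDetails": "failed_to_pick_backend"}}',
]


def count_error_details(lines, min_status=500):
    """Count statusDetails strings for entries whose HTTP status is >= min_status."""
    counts = Counter()
    for line in lines:
        entry = json.loads(line)
        status = entry.get("httpRequest", {}).get("status", 0)
        if status >= min_status:
            # Assumed field location; adjust if your export uses another layout.
            details = entry.get("jsonPayload", {}).get("statusDetails", "<missing>")
            counts[details] += 1
    return counts


if __name__ == "__main__":
    # Print the most frequent explanations first.
    for details, n in count_error_details(SAMPLE_LOG_LINES).most_common():
        print(f"{n:3d}  {details}")
```

If one string dominates the 5xx entries, looking it up in the documentation's statusDetails tables is a good next step.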

To enable the log entries in an HTTP Load Balancer please follow this guide.

The message “Error: Server Error The server encountered a temporary error and could not complete your request.” can have several causes, including:

  • There's no firewall rule configured to allow health checks.
  • The software on the backends isn't running.
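The first cause can be checked offline by comparing the source ranges of your ingress firewall rule with the ranges the health check probes come from. This is a rough sketch, assuming the probe ranges are 130.211.0.0/22 and 35.191.0.0/16 (the ranges Google documents for HTTP(S) load balancer health checks; verify them against the current documentation). The function name `uncovered_health_check_ranges` is hypothetical.

```python
import ipaddress

# Assumed health check probe source ranges; confirm against current GCP docs.
HEALTH_CHECK_RANGES = ("130.211.0.0/22", "35.191.0.0/16")


def uncovered_health_check_ranges(allowed_source_ranges):
    """Return the probe ranges not fully contained in any single allowed range.

    Note: a probe range covered only by the union of several smaller allowed
    ranges is reported as uncovered; this sketch does not merge ranges.
    """
    allowed = [ipaddress.ip_network(r) for r in allowed_source_ranges]
    missing = []
    for probe_range in HEALTH_CHECK_RANGES:
        probe = ipaddress.ip_network(probe_range)
        covered = any(
            a.version == probe.version and probe.subnet_of(a) for a in allowed
        )
        if not covered:
            missing.append(probe_range)
    return missing


if __name__ == "__main__":
    # Example: a rule that only allows one of the two probe ranges.
    print(uncovered_health_check_ranges(["130.211.0.0/22"]))
```

An empty result means every probe range is allowed by at least one of the given source ranges; it says nothing about ports or target tags, which still need checking on the rule itself.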

This page provides a detailed guide to troubleshooting general connectivity issues.

I also found these posts about the HTTP load balancer and 502 responses; the threads contain useful information.

0

In my case, the issue was that the health check was not getting a 200 response. A request to the default path / returned 302 (Found) and redirected to another URL that returned 200. The load balancer's health check ignored the redirect target and deemed the node "unhealthy", so instead of routing incoming HTTP(S) requests to the supposedly broken node, it took the node out of rotation and returned this 502 error message to the client:

Error: Server Error The server encountered a temporary error and could not complete your request. Please try again in 30 seconds.

Underneath my load balancer was a GKE cluster (GKE Ingress -> Service -> Pod) with no explicit liveness/readiness probes configured, so by default the health checks hit /, which answered with a 302 (Found) redirect.

After adding those probes to the Deployment manifest and pointing them at endpoints that return 200 OK (/-/healthy and /-/ready in my case, with Prometheus running inside the pod), the issue was fixed.

Unfortunately, GKE Ingress only showed an uninformative UNHEALTHY status in its annotations, so it took me a while to understand what was causing the issue.
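To see roughly what the health check sees, you can request the probe path without following redirects and compare the raw status code with 200. This is a minimal sketch using only the Python standard library; the function names `probe_status` and `looks_healthy` are hypothetical, and the host, port, and paths are whatever your backend actually exposes.

```python
import http.client


def probe_status(host, port, path="/", timeout=5.0):
    """Send one plain HTTP GET and return the raw status code.

    http.client does not follow redirects, so a 302 is reported as 302,
    which mirrors how a redirecting path fails a 200-only health check.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        return conn.getresponse().status
    finally:
        conn.close()


def looks_healthy(host, port, path="/", timeout=5.0):
    """Return True only if the path answers with exactly 200."""
    try:
        return probe_status(host, port, path, timeout) == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, or malformed response: treat as unhealthy.
        return False
```

For example, `looks_healthy("127.0.0.1", 9090, "/-/healthy")` checks a locally reachable Prometheus pod (for instance through a port-forward; the address and port here are assumptions), while the same call with `"/"` would return False if that path redirects.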

Ihor K