
I am getting a 504 Gateway Timeout error from a GET call to another service. I recently tried to fix it by increasing the timeout period, but that didn't help.

This is what I have tried:

   public void getUserInformation(final Integer userId) {
        HttpClient httpClient = getBasicAuthDefaultHttpClient();
        HttpGet httpGet = new HttpGet("http://xxxx/users/"+userId);
        httpGet.addHeader("userid", userid);
        httpGet.addHeader("secret", secret);
        try {
            HttpResponse response = httpClient.execute(httpGet);
            HttpEntity entity = response.getEntity();

            if (entity != null
                    && HttpStatus.OK.value() == response.getStatusLine().getStatusCode()) {
                ObjectMapper objectMapper = new ObjectMapper();
                userInfo = objectMapper.readValue(entity.getContent(), UserInfo.class);
            } else {
                logger.error("Call to the service failed: response code: {}",
                        response.getStatusLine().getStatusCode());
            }
        } catch (Exception e) {
            logger.error("Exception: "+ e);
        }

   }

  public HttpClient getBasicAuthDefaultHttpClient() {
    CredentialsProvider provider = new BasicCredentialsProvider();
    UsernamePasswordCredentials creds = new UsernamePasswordCredentials(user, password);
    provider.setCredentials(AuthScope.ANY, creds);

    //Fix to avoid HTTP 504 ERROR (GATEWAY TIME OUT ERROR) for ECM calls
    RequestConfig.Builder requestBuilder = RequestConfig.custom();
    requestBuilder.setConnectTimeout(30 * 1000);
    requestBuilder.setConnectionRequestTimeout(30 * 1000);

    HttpClientBuilder builder = HttpClientBuilder.create();
    builder.setDefaultRequestConfig(requestBuilder.build());
    builder.setDefaultCredentialsProvider(provider);

    return builder.build();
  }

I call this method in a loop to process records. It works for most of the records but fails for a few userIds. However, I noticed that everything works fine when I rerun only the failed records, so I am not sure what the problem is.

I thought of calling the method again when I receive a 504, hoping to get a 200 the next time.

I am not sure whether this is a good idea. Any advice would be greatly appreciated.

user3919727
  • 1
    A thought that comes to mind... Is the timeout different every time? My guess is that the other website is limiting the number of requests that they will handle from a particular ip address. Perhaps you could try rate limiting the number of requests that you make to the website. Perhaps even a half second delay is enough... – hooknc Oct 29 '19 at 21:08
  • @hooknc: Timeout is not different every time and there is not any limits on incoming requests. For those failed requests what I saw in the log is it took only 10 secs and returned 504 error! – user3919727 Oct 30 '19 at 17:29
  • Make sure proxy URL is correct and there's not any spelling mistake. Sometimes silly mistakes like this can give you hours of headache. – Chinmay Raikwar Apr 27 '21 at 10:52

2 Answers


According to the description of the 504 Gateway Timeout status code, it is returned when a chain of servers cooperates to process the request and one of the nodes (not the server you are calling, but one further down the chain) is not able to respond in a timely fashion.

I would presume that the situation you are in could be depicted as follows.

CLIENT -> USERS SERVICE -> SOME OTHER SERVICE

The problem is that SOME OTHER SERVICE is taking too long to process your request. The USERS SERVICE gives up at some point in time and returns you this specific status code to indicate that.
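
To illustrate why raising the timeout on the client may not change anything, here is a minimal sketch that models the chain as two deadlines. The class and method names are hypothetical, and the 10-second gateway timeout is only an assumption based on the asker's comment that failed requests returned after about 10 seconds.

```java
public class GatewayTimeoutModel {

    enum Outcome { OK_200, GATEWAY_TIMEOUT_504, CLIENT_TIMEOUT }

    /**
     * Decides what the client observes, given how long the downstream
     * service needs and how long each hop is willing to wait (milliseconds).
     * This is a simplified model, not a description of any real gateway.
     */
    static Outcome observe(long downstreamLatencyMs, long gatewayTimeoutMs, long clientTimeoutMs) {
        // Whichever deadline expires first ends the exchange.
        long firstDeadline = Math.min(gatewayTimeoutMs, clientTimeoutMs);
        if (downstreamLatencyMs <= firstDeadline) {
            return Outcome.OK_200;
        }
        // The gateway gives up first: the client receives a 504 response.
        // Otherwise the client gives up first and sees its own timeout.
        return gatewayTimeoutMs <= clientTimeoutMs
                ? Outcome.GATEWAY_TIMEOUT_504
                : Outcome.CLIENT_TIMEOUT;
    }

    public static void main(String[] args) {
        // Assumed numbers: the downstream call needs 15 s, the gateway waits 10 s.
        System.out.println(observe(15_000, 10_000, 30_000));
        System.out.println(observe(15_000, 10_000, 120_000));
        System.out.println(observe(5_000, 10_000, 30_000));
    }
}
```

In this model, the first two calls give the same result: as long as the gateway's own deadline is shorter than the downstream latency, a larger client timeout makes no difference.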

As far as I know, there is little you can do to mitigate the problem. You need to get in touch with the owners of the USERS SERVICE and ask them to increase their timeout, or with the owners of SOME OTHER SERVICE and ask them to improve their performance.

As for why such an error could occur only from time to time: it is possible that you, in combination with other clients, are transitively overloading SOME OTHER SERVICE, causing it to process requests more and more slowly. Or it could be that SOME OTHER SERVICE has throttling or rate limiting enabled to prevent Denial of Service attacks. By making too many requests to the USERS SERVICE, you may be consuming the quota it has.

Of course, all of this is speculation, since I do not know your actual scenario.
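
If overload or rate limiting really is the cause, one thing a client can try is to retry a failed GET after a growing pause, which also spaces the requests out. The following is only a sketch under that assumption: `RetryOn504` and `callWithBackoff` are hypothetical names, and the `IntSupplier` stands in for the real HTTP call and returns its status code.

```java
import java.util.function.IntSupplier;

public class RetryOn504 {

    static final int GATEWAY_TIMEOUT = 504;

    /**
     * Runs the request until it returns something other than 504 or the
     * attempts run out, doubling the pause between attempts.
     * Returns the last status code that was observed.
     */
    static int callWithBackoff(IntSupplier request, int maxAttempts, long initialDelayMs) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delayMs = initialDelayMs;
        int status = request.getAsInt();
        for (int attempt = 1; attempt < maxAttempts && status == GATEWAY_TIMEOUT; attempt++) {
            try {
                Thread.sleep(delayMs);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and stop retrying.
                Thread.currentThread().interrupt();
                return status;
            }
            delayMs *= 2; // exponential backoff
            status = request.getAsInt();
        }
        return status;
    }

    public static void main(String[] args) {
        // Stand-in for the real call: answers 504 twice, then 200.
        int[] calls = {0};
        IntSupplier fakeRequest = () -> ++calls[0] <= 2 ? 504 : 200;
        int status = callWithBackoff(fakeRequest, 4, 10);
        System.out.println("status=" + status + " after " + calls[0] + " calls");
    }
}
```

Retrying is reasonable here because GET is meant to be idempotent. The attempt limit matters: unbounded retries would only add load to a service that is already struggling.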

Momchil Atanasov

I faced the same problem some time ago; below are the checks I did to resolve it. I will add more detail to the chain described in the answer above.

Client -> Users Service -> Some Other Service

Client checks:

'Some Other Service' checks: check whether throttling or rate limiting is enabled to guard against DoS attacks. If it is, you need to increase the timeouts on Some Other Service. I was using a Tomcat server on AWS and changed the idle timeout in the YAML file:

metadata:
    annotations:
        # below is for OpenShift, which worked for me
        haproxy.router.openshift.io/timeout: "20000"
        # below is for the ELB idle timeout on Kubernetes
        service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: "20000"

I also changed the connector timeout on Tomcat:

 <Connector connectionTimeout="20000" port="8080" protocol="HTTP/1.1" redirectPort="8443"/>

Voila! It worked for me.