4

Apologies but this will be a bit of a wishy washy question, I am out of ideas at the moment; apart from testing all the connections (all look good individually) I am not sure what else I can even change. Please tell me if there is any other information I can give.

I built an AWS stack 2 days ago with requests to a Route 53 aliased to an Application Load Balancer (for SSL only), forwarding to a single EC2 instance over HTTP:80. The Instance is an Elastic Beanstalk Docker container, proxied with NGinx internally, running tomcat:9 hosting a java web app. I had a previous stack (the same components as far as I can see) which did not exhibit the problem below. Running the docker image in a local container also does not exhibit these symptoms.

All tests are done in a browser (Chrome and Firefox so far) with no other windows open, same with browser tools opened or closed.

When POSTING to the url for the first time 'in a while' the OPTIONS request has the timing profile below:

First (OPTIONS) request

The "Initial Connection" is taking 21s! This is repeatable, every time it's the 'first time in a while'. Thereafter, we get the second profile. The POST part of the request then looks like this:

Second (POST) request

Any further requests get a good response (GET below).

Third (GET) request

Refreshing the page immediately and performing the same OPTIONS/POST gets the fast response.

Fourth (POST) request

Tobin
  • 1,698
  • 15
  • 24
  • 2
    If it's multi-AZ ALB, do you have instances in each AZ? Do ALB access logs or metrics help you identify if the latency is ALB itself or upstream from there? Similarly nginx logs? Any chance the webapp itself has huge latency on first request if it's cold? – jarmod Nov 02 '17 at 22:41
  • As an update; the logs revealed nothing helpful so far, have stripped stack for the weekend, more update later. It was multi-AZ but only with incoming on one AZ, it was never a problem before. – Tobin Nov 03 '17 at 17:13
  • 1
    Sharing ELB latency diagnostic steps (I know you're using ALB) in case it's helpful: https://aws.amazon.com/premiumsupport/knowledge-center/elb-latency-troubleshooting/. But I think you should run instances in each and every AZ that the LB targets (or revert to a single AZ config for the LB). – jarmod Nov 03 '17 at 18:02
  • 1
    I think jarmod might have the closest idea, I created another environment from scratch with that as the only difference. A further issue was identified with the SQL Server Express DB, it had very slow response times for the first few hours with high CPU usage. So whilst not a particularly helpful solution, the problem "abated". – Tobin Nov 10 '17 at 12:38
  • 1
    Did you find the root cause? – manash Apr 26 '21 at 20:13
  • I came across the same issue recently. The EC2 was in a different AZ than the ones listed on the ALB. Matching those solved the stall for me. – Khalid Ibrahim May 17 '23 at 02:06

0 Answers0