
I have a Rails app uploading data to a Node.js endpoint. The endpoint works for smaller data sets, but beyond a certain size it begins to time out consistently with a 504 error. Even after the 504 appears, the Node.js logs show the endpoint getting hit, and a few minutes later the image created from the uploaded data shows up in S3. I know it sounds right off the bat like I should just upload the data to S3 from the Rails app, but let's take it as a given that that's a no-go.
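In other words, the endpoint appears to do all of its work (processing plus the S3 upload) before responding, which is why the image can still show up minutes after everything upstream has given up on the request. As a rough sketch of that shape only; Express, the aws-sdk client, and the renderImage step below are all stand-ins, not the actual code:

const express = require('express');
const AWS = require('aws-sdk');

const app = express();
const s3 = new AWS.S3();

// Stand-in for the real (slow) image-generation step.
async function renderImage(buffer) {
  return buffer;
}

app.post('/upload', express.raw({ type: '*/*', limit: '20mb' }), async (req, res) => {
  // For a large payload this runs for minutes, during which nothing is
  // written back through nginx or the ELB.
  const image = await renderImage(req.body);

  await s3.upload({ Bucket: 'some-bucket', Key: 'image.png', Body: image }).promise();

  // The 200 only goes out after the upload completes; by then the ELB
  // has already handed the Rails client a 504.
  res.sendStatus(200);
});

app.listen(3500);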

Looking at the Nginx access logs for the Node.js application, I see a 499 being returned for the endpoint.

According to https://httpstatuses.com/499, the 499 means the client closed the connection.

This person had a similar issue: NginX issues HTTP 499 error after 60 seconds despite config. (PHP and AWS)

However, I've already implemented their solution (increasing the idle timeout for the associated ELB on AWS). No dice, despite confirmation here (How to figure out Nginx status code 499) that the ELB is likely the 'client' closing the connection to the Node.js application.
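For concreteness, that ELB change amounts to raising ConnectionSettings.IdleTimeout on the load balancer; via the JavaScript aws-sdk it would look roughly like this (a sketch only: the load balancer name is a placeholder, and I'm assuming a classic ELB since that's what the log fields below come from):

const AWS = require('aws-sdk');

const elb = new AWS.ELB();

// Raise the classic ELB's idle timeout from the 60-second default to
// 10 minutes. 'my-load-balancer' is a placeholder name.
elb.modifyLoadBalancerAttributes({
  LoadBalancerName: 'my-load-balancer',
  LoadBalancerAttributes: {
    ConnectionSettings: { IdleTimeout: 600 }
  }
}).promise().then(console.log).catch(console.error);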

The ELB logs show a 504 error with the following values:

request_processing_time = -1
backend_processing_time = -1
response_processing_time = -1
elb_status_code = 504
backend_status_code = 0
received_bytes = 2167746
sent_bytes = 0

I'm not 100% sure how to interpret this, since the ELB clearly received all the bytes and passed them on to the Node.js server (or else my image would never have been created). I'm assuming sent_bytes refers to the bytes sent back to the client in the response, in which case 0 makes perfect sense since the request timed out.

This post outlined some potential interpretations: https://hardwarehacks.org/blogs/devops/2015/12/29/1451416941200.html

Based on that article, the log line I'm seeing seems to indicate: "The application did not respond to the ELB at all, instead closing its connection when data was requested. This is a fast timeout — the 504 will typically be returned in a matter of milliseconds, well under the ELB's timeout setting."

That doesn't sound right, since it takes about two minutes to time out. Two minutes is the default Node.js server timeout, so I thought perhaps I hadn't effectively lengthened it, but it's set to significantly longer (10 minutes, for debugging purposes), which I confirmed by timing how long an idle connection stays open with

time telnet localhost 3500
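For reference, raising that default can be done on the http.Server itself via server.setTimeout; a minimal sketch (assuming an Express app on port 3500, to match the nginx config below):

const express = require('express');

const app = express();
const server = app.listen(3500); // app.listen returns the underlying http.Server

// Node's http.Server drops idle sockets after 120000 ms (2 minutes) by
// default; raise that to 10 minutes to match the nginx/ELB settings.
server.setTimeout(10 * 60 * 1000);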

I'm using Puma as my Rails server, and Puma does not include a request timeout mechanism; even if it did, there are no errors being logged. I thought maybe it was the HTTParty timeout, but then I wouldn't be getting a 504 response; HTTParty would raise a Net timeout error (e.g. Net::ReadTimeout) instead.

My nginx config is based on https://www.digitalocean.com/community/tutorials/how-to-set-up-a-node-js-application-for-production-on-ubuntu-14-04

server {
    listen 80;

    location / {
        # Proxy everything to the Node.js app on port 3500
        proxy_pass http://127.0.0.1:3500;
        proxy_http_version 1.1;
        proxy_set_header   Upgrade $http_upgrade;
        proxy_set_header   Connection '';
        proxy_set_header   Host $host;

        # Allow large request bodies (the failing payload is ~2 MB)
        client_max_body_size 20m;

        # Client-side timeouts, raised to 10 minutes
        client_body_timeout 600s;
        keepalive_timeout   600s;
        send_timeout        600s;

        # Upstream (Node.js) timeouts, raised to 10 minutes
        proxy_connect_timeout 600s;
        proxy_send_timeout    600s;
        proxy_read_timeout    600s;
    }
}

This isn't my area of expertise, and I'm out of ideas. I'd love some direction, ideas, clarifications, etc. on how to make these larger data files successfully go through without timing out. Hopefully there's something super obvious I overlooked.

I saw a weird error post somewhere once where they were running out of temp staging space in nginx for uploads or something freaky, FWIW... – rogerdpack Nov 18 '19 at 23:32
