I have two Rails apps (a production and a staging environment) on a remote server.
I am currently experiencing a strange problem where Puma sometimes gives me timeouts after I finish a deployment (via `cap deploy`). This has been happening for quite some time now and is getting more frequent. Whenever it happens, I need to restart the Puma server, either with `cap puma:stop` and `cap puma:start`, or by manually running `kill -9 <pid of puma instance>`. In both cases I first have to `rm puma.sock` from the `shared/tmp/sockets` directory.
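Concretely, the recovery that works looks like this (a sketch, assuming the standard capistrano3-puma tasks; the socket path is from my staging deploy):

```
# From my local machine:
bundle exec cap staging puma:stop
#   ...or, on the server: ps aux | grep puma, then kill -9 <pid>

# On the server, remove the stale socket before starting again:
rm /var/deploy/medictrust-staging/shared/tmp/sockets/puma.sock

# Back on my local machine, bring up a fresh Puma instance:
bundle exec cap staging puma:start
```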
On the other hand, my production environment has not experienced this issue. The only difference between the two is the number of commits; my staging environment is about 50 commits ahead. When I earlier merged staging into production and deployed, the same problem appeared in production. So I rolled production back to the previous revision, restarted Puma, and the problem went away.
Note: `cap puma:restart` somehow does not solve this; I have to kill the current Puma instance and start a new one to make the problem go away.
My current setup is:
- Rails 4.1
- Puma
- Nginx
- Capistrano 3
At the time the error occurred, nothing was logged to the Rails log, but Nginx logged errors like these:
- `upstream timed out (110: Connection timed out) while reading response header from upstream` (the 500 page is shown after a 60-second wait)
- `recv() failed (104: Connection reset by peer) while reading response header from upstream` (the 500 page is shown instantly)
- `connect() to unix:/var/deploy/medictrust-staging/shared/tmp/sockets/puma.sock failed (111: Connection refused) while connecting to upstream` (the 500 page is shown instantly)
The errors above happen randomly; sometimes it's the connection timeout, sometimes the connection refused, but the timeout is the most frequent one.
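For context, the 60-second wait matches Nginx's default `proxy_read_timeout`. My Nginx side is the usual upstream-over-a-unix-socket proxy; a minimal sketch of that shape (the server name is a placeholder, not my real config):

```
upstream puma_staging {
  # Puma listens on this unix socket (same path as in the error above)
  server unix:/var/deploy/medictrust-staging/shared/tmp/sockets/puma.sock fail_timeout=0;
}

server {
  listen 80;
  server_name staging.example.com;  # placeholder

  location / {
    proxy_set_header Host $http_host;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    # proxy_read_timeout defaults to 60s, matching the wait before the 500
    proxy_pass http://puma_staging;
  }
}
```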
The strange thing is, Puma does not time out when I access my application via cURL. No changes were made to the Puma or Nginx configs, so is it possible that this is caused by application code?
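For what it's worth, Puma can also be probed directly over its socket, bypassing Nginx entirely; a sketch of that check (assumes curl 7.40+ for `--unix-socket`):

```
# Talk to Puma straight through the unix socket, skipping Nginx
curl --unix-socket /var/deploy/medictrust-staging/shared/tmp/sockets/puma.sock http://localhost/

# Confirm something is actually listening on the socket
ss -x | grep puma.sock
```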
How do I make this problem go away for good?