Major edit: Since originally finding this issue I have whittled it down to the below. I think this is now a marginally more precise description of the problem. Comments on the OP may therefore not correlate entirely.
Edit lightly modified version posted in rails/puma projects: https://github.com/rails/rails/issues/21209, https://github.com/puma/puma/issues/758
Edit Now reproduced with OS X and Rainbows
Summary: When using Puma and running long-running connections I am consistently receiving errors related to ActiveRecord connections crossing threads. This manifests itself in message like message type 0x## arrived from server while idle
and a locked (crashed) server.
The set up:
- Ubuntu 15 / OSX Yosemite
- PostgreSQL (9.4) / MySQL (
mysqld 5.6.25-0ubuntu0.15.04.1
) - Ruby - MRI
2.2.2p95 (2015-04-13 revision 50295) [x86_64-linux]
/ Rubiniusrbx-2.5.8
- Rails (
4.2.3
,4.2.1
) - Puma (
2.12.2
,2.11
) - pg (
pg-0.18.2
) / mysql2
Note, not all combinations of the above versions have been tried. First listed version is what I'm currently testing against.
rails new issue-test
- Add a route
get 'events' => 'streaming#events'
- Add a controller
streaming_controller.rb
- Set up database stuff (
pool: 2
, but seen with different pool sizes)
Code:
class StreamingController < ApplicationController
include ActionController::Live
def events
begin
response.headers["Content-Type"] = "text/event-stream"
sse = SSE.new(response.stream)
sse.write( {:data => 'starting'} , {:event => :version_heartbeat})
ActiveRecord::Base.connection_pool.release_connection
while true do
ActiveRecord::Base.connection_pool.with_connection do |conn|
ActiveRecord::Base.connection.query_cache.clear
logger.info 'START'
conn.execute 'SELECT pg_sleep(3)'
logger.info 'FINISH'
sse.write( {:data => 'continuing'}, {:event => :version_heartbeat})
sleep 0.5
end
end
rescue IOError
rescue ClientDisconnected
ensure
logger.info 'Ensuring event stream is closed'
sse.close
end
render nothing: true
end
end
Puma configuration:
workers 1
threads 2, 2
#...
bind "tcp://0.0.0.0:9292"
#...
activate_control_app
on_worker_boot do
require "active_record"
ActiveRecord::Base.connection.disconnect! rescue ActiveRecord::ConnectionNotEstablished
ActiveRecord::Base.establish_connection(YAML.load_file("#{app_dir}/config/database.yml")[rails_env])
end
- Run the server
puma -e production -C path/to/puma/config/production.rb
Test script:
#!/bin/bash
timeout 30 curl -vS http://0.0.0.0/events &
timeout 5 curl -vS http://0.0.0.0/events &
timeout 30 curl -vS http://0.0.0.0/events
This reasonably consistently results in a complete lock of the application server (in PostgreSQL, see notes). The scary message comes from libpq
:
message type 0x44 arrived from server while idle
message type 0x43 arrived from server while idle
message type 0x5a arrived from server while idle
message type 0x54 arrived from server while idle
In the 'real-world' I have quite a few extra elements and the issue presents itself at random. My research indicates that this message comes from libpq
and is subtext for 'communication problem, possibly using connection in different threads'. Finally, while writing this up, I had the server lock up without a single message in any log.
So, the question(s):
- Is the pattern I'm following not legal in some way? What have I mis[sed|understood]?
- What is the 'standard' for working with database connections here that should avoid these problems?
- Can you see a way to reliably reproduce this?
or
- What is the underlying issue here and how can I solve it?
MySQL
If running MySQL, the message is a bit different, and the application recovers (though I'm not sure if it is then in some undefined state):
F, [2015-07-30T14:12:07.078215 #15606] FATAL -- :
ActiveRecord::StatementInvalid (Mysql2::Error: This connection is in use by: #<Thread:0x007f563b2faa88@/home/dev/.rbenv/versions/2.2.2/lib/ruby/gems/2.2.0/gems/actionpack-4.2.3/lib/action_controller/metal/live.rb:269 sleep>: SELECT `tasks`.* FROM `tasks` ORDER BY `tasks`.`id` ASC LIMIT 1):