3

I have multiple databases, with the same tables and columns names inside (but different unique ids and rows..). Instead of having one huge database with all the rows, it is splitted into different databases. This is something I cannot change (you can think of this as collecting the same data from different countries, but each country has its own database). These databases are "read only" - meaning, when I use them via Rails, it is only to display the data (or save it on a local database) - I do not change the data on any of the remote databases.

My problem is, that I need to have 1 model in rails, that collects the data from all these databases. We want to be able to do something like:

OneModelAllDB.select_where(...)

and not to split each search to:

data1 = FirstDBModel.select_where(same_condition)
data2 = SecondDBModel.select_where(same_condition)
...
data = data1 + data2 + ...

Also, if I want to make 1 model with threads (parallel searches), there is a problem:

[:db1, :db2].each do |db|
  threads << Thread.new do
    self.establish_connection(db)
    results[db] = self.where(bla_bla_condition)
  end
end

because the modification of the connection is not threadsafe...

Is there any way to do so? As we do not change any of these databases, and each row has a unique id, there shouldn't be any problem to get the data from different databases and join it together...

Thanks!

Guy
  • 137
  • 9
  • If you want it to act like one big db without actually changing the structure, you'll need to write your own multi-db access methods in your models that do what I showed in my answer. (The model can only access one actual database at a time.) Then elsewhere in the code you can reference the mew methods. – lurker Aug 02 '15 at 15:12

1 Answers1

2

Rails active records are abstracted from the database connection. The connection is managed separately. You won't have something like, FirstDB.select_where(...).

However, what you can try is define your different databases in your config/database.yml, for example:

db1:
  adapter: postgresql
  encoding: unicode
  database: database1
  pool: 5
  username: xxx
  password: xxx

db2:
  adapter: postgresql
  encoding: unicode
  database: database2
  pool: 5
  username: xxx
  password: xxx

Then in your code, you would use ActiveRecord::Base.establish_connection to re-establish the connection to the database you want before running the query. If your active record model is called Foo, then:

Foo.establish_connection(:db1)
data1 = Foo.where(...)

Foo.establish_connection(:db2)
data2 = Foo.where(...)

I have not tried this in detail myself, but it should work something like that.

lurker
  • 56,987
  • 9
  • 69
  • 103
  • First - thanks for the answer. When I wrote `FirstDB.select_where(same_condition)` I meant for first model with `establish_connection(:db1)`, and second model with `:db2` etc.. The thing is I want 1 model to include them all (all the databases), and to be able to do this in different threads (establish_connection is not thread safe) – Guy Aug 02 '15 at 15:39
  • @Guy the answer I am showing is using the *same model*, `Foo`. But if you want one call to the model to do multiple db accesses, you need to write your own class methods for the model that does the different db connections as shown. – lurker Aug 02 '15 at 15:45
  • @Guy if your databases are all the same type (*e.g.*, they're all MySQL), you could write your custom `Foo.multi_select` using [multi-database queries utilizing `UNION`](http://stackoverflow.com/questions/2132654/querying-multiple-databases-at-once). You would still need to have proper authentication to all databases you're accessing, of course. – lurker Aug 02 '15 at 17:07
  • But if I want to do those with different threads (as added in the original post)? I was told this does not work using the `establish_connection`, as it is not threadsafe.. Other than that - your solution is exactly what I was thinking and is great! Thanks! – Guy Aug 02 '15 at 20:29
  • @Guy ah, I see. I'm not familiar with the "thread-safeness", or lack thereof, of `establish_connection`. Perhaps you could use the `UNION` technique. Either way, there's a bit of work to hunt down the AR methods that you want to either override or recast (with new names) to utilize that technique. – lurker Aug 02 '15 at 23:39