0

I'm looking for a way to perform a search against multiple models (see this post), and got a couple of answers saying that Thinking Sphinx would be a good match for this kind of thing.

Indeed, it looks sweet, and it seems the application-wide search capability (ThinkingSphinx.search) is close to what I want. But the docs state this will return various kinds of model objects, depending on where a match was found.

I have a models somewhat like this:

  • Employee
  • Company
  • Municipality
  • County

Where employees are linked to County only though Company, which in turn is linked to a Municipality, which in turn is linked to the actual County.

Now as a result from my search, I really only want Employee objects. For example a search for the string "joe tulsa" should return all Employees where both words can be found somewhere in the named models. I'll get some false positives, but at least I should get every employee named "Joe" in Tulsa county.

Is this something that can be achieved with built in functionality of Thinking Sphinx?

Community
  • 1
  • 1
rogerkk
  • 5,494
  • 5
  • 37
  • 53

1 Answers1

3

I think what you should do in this case is define associations for your Employee model (which you probably have already), e.g.:

class Employee < ActiveRecord::Base
  ...
  belongs_to :company
  has_one :municipality, :through => :company
  has_one :county, :through => :company
  ...
end

class Company < ActiveRecord::Base
  ...
  belongs_to :municipality
  has_many :employees
  has_one :county, :through => :municipality
  ...
end

class Municipality < ActiveRecord::Base
  ...
  belongs_to :county
  has_many :companies
  ...
end

class County < ActiveRecord::Base
  ...
  has_many :municipalities
  ...
end

Edit: I tested the multi-level has_one relationship, and it doesn't work like that. Seems to be fairly complex to model these 4 layers without denormalizing. I'll update if I come up with something. In any case, if you denormalize (i.e. add redundant foreign IDs to all models to your employees table), the associations are straightforward and you massively increase your index generation time. At the same time, it may involve more work to insure consistency.

Once the associations are set up, you can define the Sphinx index in your Employee model, like this:

define_index do
  indexes :name, :sortable => :true
  indexes company(:name),      :as => :company
  indexes municipality(:name), :as => :municipality
  indexes county(:name),       :as => :county
  ...
end

That way the columns in your associations are indexed as well, and Sphinx will automatically join all those tables together when building the index.

M. Cypher
  • 6,966
  • 2
  • 34
  • 34
  • Ahh, sweet, I will check this out. Thanks! I see nested has_many is to be supported in Rails 3.1, btw. – rogerkk Aug 12 '11 at 12:58
  • Ah, guess I can live with denormalizing the db, but should you ever come a cross a cleaner solution, do tell. :) Rails still makes it pretty easy to ensure consistency - I just added an after_save hook to update the values. – rogerkk Aug 13 '11 at 00:44
  • It's worth noting that you can drill through associations - so you don't need to add the `has_one :through` if Sphinx is the only reason for it - in your index, the following will work: `indexes company.municipality.county.name, :as => :county`. – pat Aug 14 '11 at 02:11