1

I am currently trying to write a worker that has to check the most recent child's status and queuing it based on that. For example, find parents whose oldest child is in school

class Parent
  has_many :children
end
class Child
  belongs_to :parent
  lookup_for :status, symbolize: true
end

Currently my scope is:

Parent.joins(:children).
      where("children.birth_date =
            (SELECT MAX(children.birth_date)
             FROM children
             WHERE children.parent_id = parents.parent_id").
      where('children.status = ?', Status[:in_school]').pluck(:parent_id)

Seems like there should be a better way to do this. Any ideas

To Clarify I am looking for all the Parents who's oldest child is still in school

tuxr
  • 31
  • 5
  • Surely you mean "parents.id", not :parent_id in the schema above? I mean, you could do it that way, but the near universal standard is simply "id" for the primary key. – Rob Jan 27 '16 at 19:39
  • I recommend you check this stackoverflow post: [http://stackoverflow.com/questions/1313120/retrieving-the-last-record-in-each-group](http://stackoverflow.com/questions/1313120/retrieving-the-last-record-in-each-group) – Antonio Grass Jan 27 '16 at 19:45
  • Yeah, generally I agree but in the database we are using it is actually in the form like parents.parents_id – tuxr Jan 27 '16 at 20:12

4 Answers4

0

If you are using the first/or last child a lot in your queries, I would add them as fields in the parent

class Parent
  belongs_to :youngest_child, class_name: Child, inverse_of: :parent
end
ilan berci
  • 3,883
  • 1
  • 16
  • 21
0

Well, you can push some of the logic into the joins() statement like below, but I don't see how you can get out having to have a subquery somewhere.

Parent.joins(%Q{left join children on children.parent_id = parents.id
                  and children.birth_date = (select max(birth_date) from children 
                  where children.parent_id = parents.id)}).
                where('children.status = ?', Status[:in_school]')

That would look like that. HTH

Rob
  • 4,404
  • 2
  • 32
  • 33
0

Alternatively, you can use order and last to find the last child, and then check the status of the child in-memory. There will be at most one child fetched by this approach, so performance is reasonable. Like below:

child=Child.order(:birth_date).joins(:parent).last
parent.do_stuff if child.in_school?
Rob
  • 4,404
  • 2
  • 32
  • 33
  • Yeah I have other scopes on Parent base that I need – tuxr Jan 27 '16 at 20:16
  • Oh, you need it to come in from parent? I'm sure we can do a merge and make it work if that is what you need. – Rob Jan 27 '16 at 20:33
  • Yeah it needs to come from parent because it'll be scoped more before the join which makes it less expensive only having to look at the limited number or parents that are remaining – tuxr Jan 27 '16 at 20:43
  • I suspect something like this might work, I'd need to gen up a sample and try it: class Child < ActiveRecord::Base scope :oldest_child, lambda {Child.order(:birth_date).first} end class Parent < ActiveRecord::Base has_many :children end Parent.joins(:children).merge(Child.oldest_child).where('children.status = ?', Status[:in_school]') – Rob Jan 27 '16 at 20:44
0

Your query works - it gets the job done and it is clear what you're going for. I tested your pattern on a similar association in the app I'm working on now, and ActiveRecord combined it all into one query correctly, and it took 1.4ms.

If you want to optimize performance, you could use a join to get the oldest child, instead of a subquery:

Parent.joins("INNER JOIN (
                SELECT c1.*
                FROM children c1
                LEFT JOIN children c2 ON (c1.parent_id = c2.parent_id AND c1.birthdate > c2.birthdate)
                WHERE c2.parent_id IS NULL
              ) c ON parent.parent_id = c.parent_id")
              .where('children.status = ?', Status[:in_school])
              .pluck(:parent_id)

It is a little bit less clear what you're going for, because it uses a sort of manual outer join, but it allows the nested join to use an index as well. In the same test scenario as above, this performed the same query in 0.9ms (almost twice as fast). That was in a very small database, with only a few hundred records. With millions of records, the difference would be noticeable.

Thanks to this StackOverflow answer, for the join pattern.

Community
  • 1
  • 1
rmhunter
  • 581
  • 2
  • 9