4

There is such code(using PublicActivity gem & Squeel)

  def index
    @activities = Activity.limit(20).order { created_at.desc }
    @one = @activities.where{trackable_type == 'Post'}.includes(trackable: [:author, :project])
    @two = @activities.where{trackable_type == 'Project'}.includes trackable: [:owner]
    @activities = @one + @two
  end

But it creates 8 SQL requests:

 SELECT "activities".* FROM "activities" WHERE "activities"."trackable_type" = 'Post' ORDER BY "activities"."created_at" DESC LIMIT 20

      SELECT "posts".* FROM "posts" WHERE "posts"."id" IN (800, 799, 798, 797, 796, 795, 794, 793, 792, 791, 790, 789, 788, 787, 786, 785, 784, 783, 782, 781)

      SELECT "users".* FROM "users" WHERE "users"."id" IN (880, 879, 878, 877, 876, 875, 874, 873, 872, 871, 869, 868, 867, 866, 865, 864, 863, 862, 861, 860)

      SELECT "projects".* FROM "projects" WHERE "projects"."id" IN (80, 79)

      SELECT "activities".* FROM "activities" WHERE "activities"."trackable_type" = 'Project' ORDER BY "activities"."created_at" DESC LIMIT 20

      SELECT "projects".* FROM "projects" WHERE "projects"."id" IN (80, 79, 78, 77, 76, 75, 74, 73, 72, 71, 70, 69, 68, 67, 66, 65, 64, 63, 62, 61)

     SELECT "users".* FROM "users" WHERE "users"."id" IN (870, 859, 848, 837, 826, 815, 804, 793, 782, 771, 760, 749, 738, 727, 716, 705, 694, 683, 672, 661)
  1. activites request are not joined
  2. some users (post owner and project owner) are loaded twice
  3. some projects are loaded twice
  4. @activities is Array. Rails relations merge methods(except +) don't work with the code above.

Any ideas to optimize it?

just so
  • 1,088
  • 2
  • 11
  • 23
  • you can always execute a raw query in rails. Simply construct the sql string and then you can parse the returned data set on the requesting side ... which will be much faster than doing 8 queries – Chris Oct 11 '13 at 16:41
  • 1) It's not rails-way. 2) I'm not very good with SQL, and don't really know how should query view at the end:) – just so Oct 11 '13 at 17:02
  • Hmmmmmmmmmmm, disagree about it not being the rails way. Anything outside of cookie-cutter queries and you'll have to use a custom where clause anyhow. The helper functions just add a bit of abstraction and portability – Chris Oct 11 '13 at 17:04
  • @Chris Can you hint how should completed SQL query look? – just so Oct 11 '13 at 18:28
  • Did my answer help at all? – MrYoshiji Oct 21 '13 at 13:07
  • Please see I added a note after CMW's great comment. If you are using Rails 4 vice an earlier version, you do have a way to reduce the number of queries with `eager_load`. It would interesting to see whether this produces a meaningful speed increase. You should give the bounty to CMW if it does! – Gene Oct 26 '13 at 00:28

5 Answers5

2

A non-rails-4, non-squeel solution is:

def index
  @activities = Activity.limit(20).order("created_at desc")
  @one = @activities.where(trackable_type: 'Post')   .joins(trackable: [:author, :project]).includes(trackable: [:author, :project])
  @two = @activities.where(trackable_type: 'Project').joins(trackable: [:owner])           .includes(trackable: [:owner])
  @activities = @one + @two
end

The combination of joins and includes looks odd, but in my testing it works surprisingly well.

This'll reduce it down to two queries though, not to one. And @activities will still by an array. But maybe using this approach with squeel will solve that, too. I don't use squeel and can't test it, unfortunately.

EDIT: I totally missed the point of this being about polymorphic associations. The above works to force

If you want to use what AR offers, it's a bit hacky but you could define read-only associated projects and posts:

belongs_to :project, read_only: true, foreign_key: :trackable_id
belongs_to :post,    read_only: true, foreign_key: :trackable_id

With those the mentioned way of forcing eager loads should work. The where conditions are still needed, so those associations are only called on the right activities.

def index
  @activities = Activity.limit(20).order("created_at desc")
  @one = @activities.where(trackable_type: 'Post')   .joins(post: [:author, :project]).includes(post: [:author, :project])
  @two = @activities.where(trackable_type: 'Project').joins(project: [:owner])        .includes(project: [:owner])
  @activities = @one + @two
end

It's no clean solution and the associations should be attr_protected to make sure they aren't set accidentally (that will break polymorphism, I expect), but from my testing it seems to work.

CMW
  • 3,093
  • 1
  • 16
  • 20
  • It doesn't work: `Can not eagerly load the polymorphic association :trackable`. I'm using rails 4 – just so Oct 25 '13 at 18:51
  • Oh, I totally missed the fact that it was about a polymorphic relation. My bad. I'm not sure there is eager loading with rails' approach to polymorphism. Let me suggest a different approach. – CMW Oct 25 '13 at 19:52
1

Using a simple Switch case in SQL:

def index
  table_name = Activity.table_name
  @activities = Activity.where(trackable_type: ['Post', 'Project'])
                        .order("CASE #{table_name}.owner_type WHEN 'Post' THEN 'a' ELSE 'z' END, #{table_name}.created_at DESC")
end

Then you can easily add the includes you want ;)

MrYoshiji
  • 54,334
  • 13
  • 124
  • 117
  • 1. Code above doesn't work 2. Right now I have no idea **how** can this SQL work, so I will go and read about SQL `case`:) – just so Oct 25 '13 at 20:16
1

I believe you will need at least two AR query invocations (as you currently have) because of the limit(20) clause. Your queries currently gives you up to 20 Posts, AND up to 20 Projects, so doing an aggregate limit on both activity types in a single query would not give the intended result.

I think all you need to do is use eager_load in the query instead of includes to force a single query. The differences between joins, includes, preload, eager_load and references methods are nicely covered here

So, with AR and squeel:

def index
    @activities = Activity.limit(20).order { created_at.desc }
    @one = @activities.where{trackable_type == 'Post'}.eager_loads(trackable: [:author, :project])
    @two = @activities.where{trackable_type == 'Project'}.eager_loads trackable: [:owner]
    @activities = @one + @two
end

And without the squeel, using just regular ActiveRecord 4:

def index
    @activities = Activity.limit(20).order(created_at: :desc)
    @one = @activities.where(trackable_type: 'Post').eager_loads(trackable: [:author, :project])
    @two = @activities.where(trackable_type: 'Project').eager_loads(trackable: :owner)
    @activities = @one + @two
end

You don't need squeel, I recently ripped it out of my project because it doesn't work properly for a number of complex queries in my experience, where AR 4 and Arel were ok.

Andrew Hacking
  • 6,296
  • 31
  • 37
1

In a nutshell, you can't optimize any further without using SQL. This is the way Rails does business. It doesn't allow access to join fields outside the AR model where the query is posed. Therefore to get values in other tables, it does a query on each one.

It also doesn't allow UNION or fancy WHERE conditions that provide other ways of solving the problem.

The good news is that these queries are all efficient ones (given that trackable_type is indexed). If the size of the results is anything substantial (say a few dozen rows), the i/o time will dominate the slight additional overhead of 7 simple queries vice 1 complex one.

Even using SQL, it will be difficult to get all the join results you want in one query. (It can be done, but the result will be a hash rather than an AR instance. So dependent code will be ugly.) The one-query-per-table is wired pretty deeply into Active Record.

@Mr.Yoshi's solution is a good compromise using minimal SQL except it doesn't let you selectively load either author or project+owner based on the trackable_type field.

Edit

The above is all correct for Rails 3. For Rails 4 as @CMW says, the eager_load method will do the same as includes using an outer join instead of separate queries. This is why I love SO! I always learn something.

Gene
  • 46,253
  • 4
  • 58
  • 96
  • In the edit to my answer I've pointed out a way to do it in two queries (or maybe in one, with sqeel) without writing your own SQL. Basically what's getting in your way here is that `:trackable` is a polymorphic relationship. But since in this case there are only two specific variants of the association that are needed, those can be put into code, removing all problems with joins that polymorphism did put in place. – CMW Oct 25 '13 at 21:59
0

That's a pretty big query there ... by the looks of it you could do it in one select, but for readability I'll use two, one for projects and one for posts.

This assumes a 1:1 relationship between activity and post/project. If this isn't correct, the problem can be solved with a subquery

select * from activities a
where a.trackable_type = 'Post'
left join posts p
on p.id = a.trackable_id -- or whatever fields join these two tables
left join users u
on a.user_id = u.id --this is joining to the main table, may want to join trackable, not sure
left join projects p
on a.project_id = p.id
order by a.created_at DESC LIMIT 20

Or, if there is a 1:many relationship, something like this:

select * from
(   select * from activities a
    where a.trackable_type = 'Post'
    order by a.created_at DESC LIMIT 20 ) activities
left join posts p
...

Edit: As I read this, I realize that I'm a bit old fashioned .... I think if you were going to use such large raw sql queries, you should make a database function, rather than coding it into your application

Chris
  • 846
  • 6
  • 16