1

I have a query that is getting counts from multiple tables by using a LEFT JOIN and subqueries. The idea is to get a count various activites a member has participated in.

The schema looks like this:

member PK member_id

table1 PK tbl1_id FK member_id

table2 PK tbl2_id FK member_id

table3 PK tbl3_id FK member_id

My query looks like this:

SELECT t1.num1,t2.num2,t3.num3
FROM member m
LEFT JOIN
(
   SELECT member_id,COUNT(*) as num1
   FROM table1
   GROUP BY member_id
) t1 ON t1.member_id = m.member_id
LEFT JOIN
(
   SELECT member_id,COUNT(*) as num2
   FROM table2
   GROUP BY member_id
) t2 ON t2.member_id = m.member_id
LEFT JOIN
(
   SELECT member_id,COUNT(*) as num3
   FROM table3
   GROUP BY member_id
) t3 ON t3.member_id = m.member_id
WHERE m.member_id = 27

Where 27 is a test id. The actual query joins more than three tables and the query is run multiple times with the member_id being changed. The problem is this query runs pretty slow. I get the info I need but I am wondering if anyone could suggest a way to optimize this. Any advice is very much appreciated. Thanks much.

TheMethod
  • 2,893
  • 9
  • 41
  • 72

3 Answers3

3

You should refactor your query. You can do this by reordering the way the query collects the data. How?

  • Apply the WHERE clause first
  • Apply JOINs last

Here is your original query:

SELECT t1.num1,t2.num2,t3.num3 
FROM member m 
LEFT JOIN 
( 
   SELECT member_id,COUNT(*) as num1 
   FROM table1 
   GROUP BY member_id 
) t1 ON t1.member_id = m.member_id 
LEFT JOIN 
( 
   SELECT member_id,COUNT(*) as num2 
   FROM table2 
   GROUP BY member_id 
) t2 ON t2.member_id = m.member_id 
LEFT JOIN 
( 
   SELECT member_id,COUNT(*) as num3 
   FROM table3 
   GROUP BY member_id 
) t3 ON t3.member_id = m.member_id 
WHERE m.member_id = 27 

Here is you new query

SELECT
   IFNULL(t1.num1,0) num1,
   IFNULL(t1.num2,0) num2,
   IFNULL(t1.num3,0) num3
FROM
(
   SELECT * FROM member m 
   WHERE member_id = 27
) 
LEFT JOIN 
( 
   SELECT member_id,COUNT(*) as num1 
   FROM table1 
   WHERE member_id = 27
   GROUP BY member_id 
) t1 ON t1.member_id = m.member_id 
LEFT JOIN 
( 
   SELECT member_id,COUNT(*) as num2 
   FROM table2 
   WHERE member_id = 27
   GROUP BY member_id 
) t2 ON t2.member_id = m.member_id 
LEFT JOIN 
( 
   SELECT member_id,COUNT(*) as num3 
   FROM table3 
   WHERE member_id = 27
   GROUP BY member_id 
) t3 ON t3.member_id = m.member_id 
;

BTW I changed member m into SELECT * FROM member m WHERE member_id = 27 in case you need any information about member 27. I also added the IFNULL function to each result to produce 0 in case count is NULL.

You need to make absolutely sure

  • member_id is the primary key of the member table
  • member_id is indexed in table1, table2, and table3

Give it a Try !!!

RolandoMySQLDBA
  • 43,883
  • 16
  • 91
  • 132
  • Do you have any evidence on this? This won't make a difference in my experience... The WHERE clause does filter the first table before it is joined on, and the joins implicitly carry that condition through. Also, to my knowledge, the optimiser won't blindly aggregate records it is aware it will discard. – MatBailie Dec 16 '11 at 16:59
  • @Dems : I actually have an empirical example. Here is post I made on how to refactor a query in spite of what the MySQL Optimizer says: http://stackoverflow.com/a/6023217/491757. – RolandoMySQLDBA Dec 16 '11 at 17:03
  • I only have time to skim through that. I'm not convinced it's the same mechanism at play. All I can recommend is for the OP to try it on real data and see. Then if it work, give @RolandoMySQLDBA the credit :) – MatBailie Dec 16 '11 at 17:07
  • 1
    @dems - Of course it will work, but it's only because the search criteria into is in each sub-select, not because of a re-ordering of the where clause. It will make the query impossible to read for anything other than the simplest conditions. – Brian Hoover Dec 16 '11 at 17:17
2

Without knowing your schema and what you've done for indexes, one POSSIBLE way to make this faster is:

SELECT (select ifnull(count(*),0) from table1 where table1.member_id = m.id) as num1,
 (select ifnull(count(*),0) from table2 where table2.member_id = m.id) as num2,
 (select ifnull(count(*),0) from table3 where table3.member_id = m.id) as num3
from member m
WHERE m.member_id = 27

Now, this is a slightly risky recommendation, simply because I don't know anything about your DB or what else is running, or where the bottlenecks are.

In general, it would be a good idea to post an explain plan with your query to get a better answer.

Brian Hoover
  • 7,861
  • 2
  • 28
  • 41
  • Correlated sub-queries rarely perform beneficially in comparison to JOINs in the OPs format. In fact, had the OP written the example given here, I'd have suggested the format the OP currently uses. – MatBailie Dec 16 '11 at 17:00
  • @Dems - At least in my test system OPs format resulted in full table scans for table1, table2 and table3, where my suggestion used the indexes correctly. – Brian Hoover Dec 16 '11 at 17:12
  • +1 because your answer is way more concise. I admit my answer comes with a little excess baggage. My inserting `m.member_id = 27` into each subquery just makes the temp tables faster to get IMHO. If a PHP developer codes this baggage, there will be benefits. Your query skips all that convolution. – RolandoMySQLDBA Dec 16 '11 at 17:25
0
SELECT num1, num2, count(*) as num3
FROM (
  SELECT member_id, num1, count(*) as num2
  FROM (
    SELECT member_id, count(*) as num1
    FROM member
    LEFT JOIN  table1 USING (member_id)
    WHERE member_id = 27) as m1
  LEFT JOIN table2 USING (member_id)) as m2
LEFT JOIN table3 USING (member_id);
newtover
  • 31,286
  • 11
  • 84
  • 89