MySQL statement that was working fine previously all of a sudden started being very slow. It includes alot of LEFT JOINs

Question

MySQL statement on a school system I have created some years back was working fine but now takes close to 30 seconds to pull what appears to me a simple statement but cannot figure out how to improve it. I am wondering if anyone can help me re-write this statement for a faster response. The statement is:

SELECT es.*,c.mainsubarea AS subject, b.name,b.email,GROUP_CONCAT(doc.document_file SEPARATOR "|") document_file 
    FROM usersubinfo es 
        LEFT JOIN userinfo b ON (es.uid=b.uid) 
        LEFT JOIN lkptsubjectarea c ON (es.mainsubjectarea=c.id) 
        LEFT JOIN lkptdeliverytime d ON (es.deliverytime = d.id) 
        LEFT JOIN documents doc ON (es.id = doc.order_id) 
    WHERE es.id AND es.is_active='Yes' 
    GROUP BY es.id 
    ORDER BY es.joindate 
    DESC LIMIT 0,25

See screenshot from Explain statement on phpMyAdmin

@Akina es.id is auto-increment integer so is not and cannot be zero in the existing data — Abdi Sultan, Dec 21 '21 at 13:46
*es.id is auto-increment integer so is not and cannot be zero in the existing data* If so then what's the meaning of `WHERE es.id AND es.is_active='Yes'`? taking into account the operators precedence it is `WHERE (es.id AND (es.is_active='Yes'))`. — Akina, Dec 21 '21 at 13:50
If this is PK then limit to 25 rows in subquery then join and group by. — Akina, Dec 21 '21 at 13:51
@Akina Thanks for pointing out - WHERE es.id is redundant but it does not make any difference to the speed of the statement. — Abdi Sultan, Dec 21 '21 at 14:04
@Akina "If this is PK then limit to 25 rows in subquery then join and group by." Can you help me visualize how you would achieve that - I am not sure what you mean by subquery limiting. — Abdi Sultan, Dec 21 '21 at 14:06
`SELECT ... FROM (SELECT * FROM usersubinfo WHERE is_active='Yes' ORDER BY joindate DESC LIMIT 0,25) es LEFT JOIN userinfo b ... ORDER BY es.joindate;` — Akina, Dec 21 '21 at 16:41

Rick James · Answer 1 · 2021-12-21T17:42:29.300

Add Indexes: These may help:

b:  INDEX(uid,  name, email)
doc:  INDEX(order_id,  document_file)

Remove LEFT: Is there a reason for LEFT JOIN instead of JOIN? I think not. See if you get the same results without LEFTs.

Remove bogus test: Why WHERE es.id? If id is the PRIMARY KEY of es, that test will always be true.

Improve GROUP+ORDER: Change

    GROUP BY  es.id
    ORDER BY  es.joindate DESC
    LIMIT  0,25

-->

    GROUP BY  es.joindate,      es.id
    ORDER BY  es.joindate DESC, es.id DESC
    LIMIT  0,25

That avoids two passes over the data -- one for the GROUPing, and another for the ORDERing. Meanwhile, I assume that my grouping and ordering is "just as good".

Turn inside-out: That brings up another issue, one I call "explode-implode". That's where you Join together a lot of rows, only to get rid of most of them. So...

Start by finding the 25 ids desired with as little effort as possible:

SELECT  id
    FROM usersubinfo
    WHERE is_active = 'Yes'
    GROUP BY  joindate,      id
    ORDER BY  joindate DESC, id DESC
    LIMIT  0,25

And include that as a 'derived' table in the rest:

SELECT  es.*, c.mainsubarea AS subject,
        b.name, b.email,
        GROUP_CONCAT(doc.document_file SEPARATOR "|") document_file
    FROM ( put the above Select here 
         )  AS ids
    JOIN  usersubinfo AS es  USING(id)
    JOIN  userinfo b  ON (es.uid=b.uid)
    JOIN  lkptsubjectarea c  ON (es.mainsubjectarea=c.id)
    JOIN  lkptdeliverytime d  ON (es.deliverytime = d.id)
    JOIN  documents doc  ON (es.id = doc.order_id)
    ORDER BY  joindate DESC, id DESC;  -- yes, repeat this

This is likely to be faster because the tables other than usersubinfo will be touched only 25 times.

(I think this will happen to avoid the "only_full_group_by" issue to which Parfait refers.)

score 0 · Answer 2 · answered Dec 21 '21 at 16:56

First and foremost, consider writing a valid ANSI SQL aggregate query with changes to your GROUP BY and SELECT clauses. Currently, your query includes only one column in GROUP BY but all columns from usersubinfo table with SELECT es.* in addition to other non-aggregated columns. You even order by a column not in GROUP BY.

Such a query runs against the SQL standard and will fail in most RDBMS's but allowable in MySQL due to its ONLY_FULL_GROUP_BY mode turned off which dangerously allows:

server is free to choose any value from each group, so unless they are the same, the values chosen are nondeterministic, which is probably not what you want

Since you have an aggregate function, GROUP_CONCAT, all other non-aggregated columns should be placed in GROUP BY clause. If you need to add columns in SELECT add it also to GROUP BY. Also, you may have a redundant LEFT JOIN which serves no purpose for other JOINs or columns for SELECT.

SELECT es.id, 
       es.joindate,
       sa.mainsubarea AS subject,
       i.name,
       i.email,
       GROUP_CONCAT(doc.document_file SEPARATOR "|") document_file 
    FROM usersubinfo es 
        LEFT JOIN userinfo i ON (es.uid = i.uid) 
        LEFT JOIN lkptsubjectarea sa ON (es.mainsubjectarea = sa.id) 
        -- LEFT JOIN lkptdeliverytime dlv ON (es.deliverytime = dlv.id)  -- POSSIBLY REDUNDANT
        LEFT JOIN documents doc ON (es.id = doc.order_id) 
    WHERE es.id IS NOT NULL 
      AND es.is_active = 'Yes' 
    GROUP BY es.id,
             es.joindate,
             sa.mainsubarea,
             i.name,
             i.email,
    ORDER BY es.joindate DESC
    LIMIT 0 
    OFFSET 25

Additionally, by avoiding SELECT * you avoid bringing in unneeded even newer columns, allowing indexes to run effectively over large table scans, and avoid sending large amount of content over network. See Why is SELECT * considered harmful?

If `id` is the PK, won't that eliminate your comment about grouping? — Rick James, Dec 21 '21 at 17:41
Sure. But the other non-agg columns may not be functional dependent on `es.id`. Maybe not for efficiency but for readable/maintainable code, explicitly laying out `SELECT` columns would be ideal. But I defer to you, the mysql guru! — Parfait, Dec 21 '21 at 20:28

MySQL statement that was working fine previously all of a sudden started being very slow. It includes alot of LEFT JOINs

2 Answers2