1

I have the following JOIN query:

SELECT
    table1.*, 
    table2.*
FROM 
    Table1 AS table1 
LEFT JOIN 
    Table2 AS table2 
USING 
    (col1)
LEFT JOIN 
    Table3 as table3 
USING 
    (col1) 
WHERE 
    3963.191 * 
    ACOS(
    (SIN(PI() * $usersLatitude / 180) * SIN(PI() * table3.latitude / 180)) 
    +
    (COS(PI() * $usersLatitude / 180) * COS(PI() * table3.latitude / 180) * COS(PI() * table3.longitude / 180 - PI() * 37.1092162 / 180))
    ) <= 10 
AND 
    table1.col1 != '1' 
AND 
    table1.col2 LIKE 'A' 
AND 
    (table1.col3 LIKE 'X' OR table1.col3 LIKE 'X-Y') 
AND 
    (table2.col4 = 'Y' OR table2.col5 = 'Y') 

It executes in under 0.15 seconds.

However, if I simply add:

ORDER BY 
    table1.col6 DESC 

It executes in over 3 seconds.

All columns in the query are indexed, including the table1.col6 used in the ORDER BY.

I tried this solution, but it did not work.

How can I get this query to run fast with the ORDER BY.


EDIT:

RESULTS OF EXPLAIN EXTENDED WITHOUT ORDER BY:

id  select_type table   type    possible_keys   key key_len ref rows    filtered    Extra
1   SIMPLE  table1  ALL PRIMARY,col2,col3   NULL    NULL    NULL    140101  72.61   Using where
1   SIMPLE  table2  eq_ref  PRIMARY,col4,col5   PRIMARY 4   table1.col1 1   100 Using where
1   SIMPLE  table3  eq_ref  PRIMARY PRIMARY 4   table1.col1 1   100 Using where

RESULTS OF EXPLAIN EXTENDED WITH ORDER BY:

id  select_type table   type    possible_keys   key key_len ref rows    filtered    Extra
1   SIMPLE  table1  ALL PRIMARY,col2,col3   NULL    NULL    NULL    140101  72.61   Using where; Using filesort
1   SIMPLE  table2  eq_ref  PRIMARY,col4,col5   PRIMARY 4   table1.col1 1   100 Using where
1   SIMPLE  table3  eq_ref  PRIMARY PRIMARY 4   table1.col1 1   100 Using where

EDIT 2:

Data Types of all columns in the query (as requested):

col1: int(11)
col2: char(1)
col3: varchar(3)
col4: char(1)
col5: char(1)
col6: int(11)
latitude: varchar(25)
longitude: varchar(25)

All 3 tables (table1, table2, and table3) are MyISAM.

Community
  • 1
  • 1
ProgrammerGirl
  • 3,157
  • 7
  • 45
  • 82
  • 1
    Have you looked at the execution plan? It may be that the indexes you have specified don't cover the query or aren't being used. I'd suspect the 'LIKE' operators might have something to do with it – Charleh Nov 22 '12 at 16:57
  • @Charleh: How can I look at the execution plan? How would the indexes I have specified not cover the query or not be used if they were done for each column used in the query? Thanks in advance for the additional insight. – ProgrammerGirl Nov 22 '12 at 17:00
  • Have a look at that http://dev.mysql.com/doc/refman/5.0/en/explain.html MySql documentation page. Try that, and see if the indexes are used, it's the only real way to be sure! – Charleh Nov 22 '12 at 17:02
  • Also look at http://dev.mysql.com/doc/refman/5.5/en/explain-output.html - a good bit of info is: `You can get a good indication of how good a join is by taking the product of the values in the rows column of the EXPLAIN output. This should tell you roughly how many rows MySQL must examine to execute the query. If you restrict queries with the max_join_size system variable, this row product also is used to determine which multiple-table SELECT statements to execute and which to abort. See Section 8.11.2, “Tuning Server Parameters”.` So try with and without the order by and check the results – Charleh Nov 22 '12 at 17:06
  • @Charleh: OK, doing the EXPLAIN now with and without the ORDER BY and will post the results in just a moment. Thanks. – ProgrammerGirl Nov 22 '12 at 17:11
  • @Charleh: I have added the EXPLAIN for each query as an edit to the main post. Please advise what insight you gather from those results as they do not mean much to my inexperienced self. Thanks. – ProgrammerGirl Nov 22 '12 at 17:22
  • The filesort is likely your problem. An index on all the table1 columns that are in your where's and order by (all in one index) might help you here. – BenOfTheNorth Nov 22 '12 at 17:28
  • @BenGriffiths: Interesting, so you think a **composite index** on `col1`, `col2`, `col3`, and `col6` in `Table1` should resolve this issue? – ProgrammerGirl Nov 22 '12 at 17:53
  • I don't know 100% (hence not putting it as an answer), but its worth trying :) – BenOfTheNorth Nov 22 '12 at 17:54
  • @BenGriffiths: OK, I'll try that. Please let me know if you have any other ideas. Thanks! – ProgrammerGirl Nov 22 '12 at 17:57
  • @BenGriffiths: Unfortunately, adding a composite index on `col1`, `col2`, `col3`, and `col6` in `Table1` did not make any difference. Please let me know if you have any other ideas. Thanks. – ProgrammerGirl Nov 22 '12 at 18:03
  • I don't suppose you could edit your answer to list the types of the columns, particularly col6? Depending on the type, it may be that MySQL is simply ignoring the index because it can't use it. This is the case with some, such as large `VARCHAR` fields. – Keith Gaughan Nov 22 '12 at 18:12
  • @KeithGaughan: Just edited my post as per your request. `col6` is `int(11)`. Please let me know if that reveals any further insight? Thanks. – ProgrammerGirl Nov 22 '12 at 22:07

5 Answers5

1

This isn't a solution, but it will likely help: avoid using '*' to specify the columns you want and specify them explicitly. When you use * to specify all columns, MySQL has to do more work shunting data around when it's sorting the data, thus more likely to do a filesort, which is slow.

Also, indexing has little or no impact on how easily MySQL can order data, though you should check that it's using BTREE indexes as it's more likely they'll be kept relatively well-ordered.

Worst comes to worst, you might want to take a read of what the MySQL manual has to say about ORDER BY optimisation if you haven't done so already. I can't see anything that would be applicable, but you might see something that I'm missing.

Keith Gaughan
  • 21,367
  • 3
  • 32
  • 30
  • I agree with what you are saying, but it doesn't make any difference in this case. For example, instead of `SELECT table1.*, table2.*` I tried just `SELECT table1.col1` and it still took over 3 seconds. Please let me know if you have any other ideas. Thanks! – ProgrammerGirl Nov 22 '12 at 17:56
  • Nope, I'm tapped out. I'll edit the answer to mention the MySQL manual page on optimising `ORDER BY`, but I see much in there that might help. – Keith Gaughan Nov 22 '12 at 18:01
  • Thanks. Sorry I couldn't be more help. – Keith Gaughan Nov 22 '12 at 18:04
1

The only other way I can think off to get around this would be to alter the table to default the order by - and then remove the order by from your query all together.

http://dev.mysql.com/doc/refman/5.1/en/alter-table.html

BenOfTheNorth
  • 2,904
  • 1
  • 20
  • 46
  • That's a last resort, which I hope to avoid. Any ideas why the simple addition of the ORDER BY would make the query more than 10x longer? – ProgrammerGirl Nov 22 '12 at 22:08
  • Because the default order is by primary key, and to reorder it on another column required the filesort, which by their nature are quite slow. Adding the DESC order on top of that slows things down even further. Putting it simply, MySQL has a lot more work to do because of it all. – BenOfTheNorth Nov 22 '12 at 22:40
  • What's strange is that I use ORDER BY DESC in several other places on my site, but it doesn't slow it down anywhere near as much as with this query. There is something specific to this query that is causing it to significantly slow down with the ORDER BY. It's a mystery I hope to solve here. :) – ProgrammerGirl Nov 22 '12 at 23:25
0

If your query is as fast as you indicated, I would expect it is returning a rather small data set. If that's the case, I would wrap yours as-is and put the ORDER BY OUTIDE it such as...

select
      PreQuery.*
   from
      ( Your Entire Query Without the order by clause ) as PreQuery
   order by
      PreQuery.Col6 Desc

This way, the outer query has nothing to do with existing indexes and just returns the data... then that result will have the order applied per the raw results.

Hope that resolves it for you.

DRapp
  • 47,638
  • 12
  • 72
  • 142
  • Isn't what you're proposing the same as this solution: http://stackoverflow.com/questions/884661/slow-query-when-using-order-by ? If so, then as mentioned in my main post, that solution did not improve the speed of the query. Any other ideas? – ProgrammerGirl Nov 22 '12 at 22:00
0

Aside from my other answer offering to wrap your existing query inside a select, then order the results. Here's another option for the query... Correct me if my interpretation is incorrect.

You are doing a LEFT-JOIN to table 2, yet requiring Col4 or Col5 = 'Y' thus indicating an INNER JOIN (record MUST exist on Table1 AND Table2) due to the where clause.

Similarly on your Table 3 joined to table 1, but restricting the distance <= 10, so a LEFT-JOIN is not really accurate as it was part of the WHERE, indicating REQUIRED.

So, my revised query is listed below. I've basically moved the "WHERE" components to the respective table JOIN command. If you really DID mean to have as LEFT-JOIN and want all Table 1 records REGARDLESS of Table2/Table3 records, then you can simply change these to LEFT JOINs.

SELECT STRAIGHT_JOIN
      T1.*, 
      T2.*
   FROM 
      Table1 AS T1
         JOIN Table2 AS T2
            ON T1.Col1 = T2.Col1
            AND ( T2.Col4 = 'Y' OR T2.Col5 = 'Y' )
         JOIN Table3 as T3
            ON T1.Col1 = T3.Col1
            AND 3963.191 
               * ACOS(  (SIN(PI() * $usersLatitude / 180) * SIN(PI() * T3.latitude / 180)) 
                                + (  COS(PI() * $usersLatitude / 180) * COS(PI() * T3.latitude / 180) 
                                   * COS(PI() * table3.longitude / 180 - PI() * 37.1092162 / 180)
                        )   
                     ) <= 10 
   WHERE
          T1.Col2 LIKE 'A'
      AND ( T1.col3 LIKE 'X' OR T1.col3 LIKE 'X-Y') 
      AND T1.Col1 != '1'
   ORDER BY
      T1.Col6

Add Column 6 to table 1's index if applying the order by clause directly

On Table 1, I would have an index on ( Col2, Col3, Col1, Col6 )
On Table 2 on ( Col1, Col4, Col5 )
On Table 3 on ( Col1 )   <-- assume already had this.
DRapp
  • 47,638
  • 12
  • 72
  • 142
  • Very interesting, thanks! I'll give this a shot and report back. – ProgrammerGirl Nov 22 '12 at 22:10
  • Where's the `ORDER BY` to sort the results? – ProgrammerGirl Nov 22 '12 at 22:15
  • @Programmer, I would just do a quick test WITHOUT the where to see how it compares with your original (without the order by), then, add order by AFTER the WHERE clause as you normally would. – DRapp Nov 22 '12 at 22:35
  • OK, I tried that without the Indexes you suggested, and there was no improvement in speed. Could you please show me the SQL code for doing the composite index on Table1 that you suggested so I can add all the composite indexes you recommended to see if that helps? Thanks! – ProgrammerGirl Nov 22 '12 at 23:23
  • @programmer, create index NameOfIndex on Table1 ( colX, colY, colZ, colN ) – DRapp Nov 22 '12 at 23:31
  • Thanks, but I'm afraid this solution is actually worse. After adding the indexes you suggested and running your query above WITHOUT the ORDER BY, it actually took over **3 seconds** to complete, whereas my original query WITHOUT the ORDER BY still completes in about **0.2 seconds**. Any other ideas? Thanks. – ProgrammerGirl Nov 23 '12 at 14:13
  • @Programmer, Sorry, but having alternative thinking (as I too would do), is what its all about... – DRapp Nov 23 '12 at 14:15
0

Why are you doing LIKEs on non-wildcard values? That's going to be making the planner be more pessimistic. But mostly, you need to look at what the query planner is showing.

Also, have you done an ANALYZE on the tables recently? If the row statistics are off, the planner will make poor choices.

Andy Lester
  • 91,102
  • 13
  • 100
  • 152
  • Thanks for your feedback. What do you mean by "Why are you doing LIKEs on non-wildcard values?" ? What should I be doing and why? Also, how do I do an ANALYZE, and how will I know if the row statistics are off? Thanks again (in advance) for your expertise. – ProgrammerGirl Nov 23 '12 at 14:05
  • `table1.col3 LIKE 'X'` has no wildcard in it, like `%`. `col LIKE 'ABC%'` matches every value that starts with the characters ABC. If you're not using wildcards, then the `LIKE` will work, but it will be doing more work than an `=`. As to how to `ANALYZE`, it's covered in the manual for your database software, whatever that may be. – Andy Lester Nov 24 '12 at 04:55
  • I did an `ANALYZE TABLE` on all 3 tables, and they all reported `OK`. I then replaced every `LIKE` with `=` (as you suggested) and it actually made the query WITHOUT the `ORDER BY` go from **0.2 seconds** to **3 seconds**. In other words, replacing `LIKE` with `=` makes the original query take just as long as adding `ORDER BY`! How is that possible, considering `LIKE` does more work than `=` as you mentioned? Perhaps therein lies the clue as to why the `ORDER BY` takes so long? Please advise. Thanks. – ProgrammerGirl Nov 24 '12 at 12:46
  • None of us can tell specifics of your situation because you really haven't posted that much. The code you posted isn't even real code, but your summary of it. I suggest that you edit your original posting to include real table definitions and the real query. Be sure to show the definitions for all your indexes as well. – Andy Lester Nov 24 '12 at 14:41