5

This is the whole query...

SELECT s.*, (SELECT url FROM show_medias WHERE show_id = s.id AND is_primary = 1) AS media_url
FROM (shows As s)
WHERE `s`.`id` IN (
 SELECT DISTINCT st.show_id
 FROM show_time_schedules AS sts
 LEFT JOIN show_times AS st ON st.id = sts.show_time_id
 WHERE sts.schedule_date BETWEEN CAST('2012-01-10' AS date) AND CAST('2012-01-14' AS date)
 )
AND `s`.`is_active` = 1
ORDER BY s.name asc 

If...

SELECT url FROM show_medias WHERE show_id = s.id AND is_primary = 1
(0.0004 sec)

And...

 SELECT DISTINCT st.show_id
 FROM show_time_schedules AS sts
 LEFT JOIN show_times AS st ON st.id = sts.show_time_id
 WHERE sts.schedule_date BETWEEN CAST('2012-01-10' AS date) AND CAST('2012-01-14' AS date)
(0.0061 sec)

Is there an obvious reason....

SELECT s.*, (inner query 1) AS media_url
FROM (shows As s)
WHERE `s`.`id` IN ( inner query 2 )
AND `s`.`is_active` = 1
ORDER BY s.name asc

is taking 5.7245 sec?

EXPLAIN EXTENDED

id  select_type         table       type    possible_keys   key     key_len ref                     rows    filtered    Extra
1   PRIMARY             s           ALL     NULL            NULL    NULL    NULL                    151     100.00      Using where; Using filesort
3   DEPENDENT SUBQUERY  sts         ALL     NULL            NULL    NULL    NULL                    26290   100.00      Using where; Using temporary
3   DEPENDENT SUBQUERY  st          eq_ref  PRIMARY         PRIMARY 4       bvcdb.sts.show_time_id  1       100.00      Using where
2   DEPENDENT SUBQUERY  show_medias ALL     NULL            NULL    NULL    NULL                    159     100.00      Using where
jondavidjohn
  • 61,812
  • 21
  • 118
  • 158
  • Out of interest, have you looked at the performance by (INNER, LEFT depending on whether you want only shows with a url or not) JOINing onto show_medias on show_medias.show_id = s.id rather than doing the subquery in your select list? I'd be interested in seeing that. Anything in the EXPLAIN for the query? – dash Jan 08 '12 at 15:41
  • @dash thanks so much for your help so far, added the `EXPLAIN EXTENDED`, and your suggested query yielded almost the same if not a little longer 6.x seconds performance. – jondavidjohn Jan 08 '12 at 18:10
  • Do you know what indexes you have on your tables? In particular, are any of the id columns used in your query index on shows, show_time_schedules, show_times and show_medias? – dash Jan 08 '12 at 18:24
  • Out of interest, does removing the ORDER BY speed up your query? – dash Jan 08 '12 at 18:28
  • @dash my goodness... thanks for mentioning indexes... This is an old project that I did before I learned to index foreign id fields along with the primary... after applying the foreign key indexes the query is humming at 0.0064 sec... add that to your answer and we'll get this thing wrapped up... thanks! – jondavidjohn Jan 08 '12 at 18:29

2 Answers2

3

You can always use EXPLAIN or EXPLAIN EXTENDED to see what MySql is doing with a query

You could also write your query a slightly different way, have you tried the following?

SELECT        s.*, 
              sm.url AS media_url 
FROM          shows AS s
INNER JOIN    show_medias AS sm ON s.id = SM.show_id
WHERE `s`.`id` IN ( 
                        SELECT DISTINCT st.show_id 
                        FROM show_time_schedules AS sts 
                        LEFT JOIN show_times AS st ON st.id = sts.show_time_id 
                        WHERE sts.schedule_date BETWEEN CAST('2012-01-10' AS date) AND CAST('2012-01-14' AS date) 
                        ) 
AND            `s`.`is_active` = 1 
AND            sm.is_primary = 1
ORDER BY       s.name asc 

It would be interesting to see what the effect of that is. I would expect it to be faster as, at the moment, I think MySql will be running inner query 1 for each show you have (so that one query will be run many times. A join should be more efficient.)

Replace the INNER JOIN with a LEFT JOIN if you want all shows that don't have a row in show_medias.

EDIT:

I'll take a look at your EXPLAIN EXTENDED shortly, I also wonder if you want to try the following; it removes all of the subqueries:

SELECT        DISTINCT s.*,  
                       sm.url AS media_url  
FROM                   shows AS s 
INNER JOIN             show_medias AS sm ON s.id = SM.show_id
INNER JOIN             show_times AS st ON (s.id = st.show_id)
RIGHT JOIN             show_time_schedules AS sts ON (st.id = sts.show_time_id)

WHERE                  `s`.`is_active` = 1  
AND                    sm.is_primary = 1 
AND                    sts.schedule_date BETWEEN CAST('2012-01-10' AS date) AND CAST('2012-01-14' AS date)  
ORDER BY               s.name asc 

(It would also be good to see the EXPLAIN EXTENDED on these - you could add it to the comments for this one).

Further EDIT:

On your EXPLAIN EXTENDED (a good start on how to read these is here)

The USING FILESORT and USING TEMPORARY are both key indicators. Hopefully, the second query I recommend should remove any TEMPORARY tables (in the subquery). Try then leaving the ORDER BY off to see if that makes a difference (and we can add that to the findings so far :-)

I can also see that the query is potentially missing out on a lot of index lookups; all of your id columns are prime candidates for index matches (with the usual index caveats). I'd also try adding those indexes and then running EXPLAIN EXTENDED again to see what the difference is now (EDIT as we already know from your comment above!)

dash
  • 89,546
  • 4
  • 51
  • 71
2

Here comes the CTE-solution: (my bad, mysql does not have CTE's, but the problem is too generic)

WITH RECURSIVE tree AS (
    SELECT t0.id
        , t0.study_start_time
        , t0.study_end_time
    FROM tab t0
    WHERE NOT EXISTS (SELECT * FROM tab nx
           WHERE nx.id=t0.id 
           AND nx.study_end_time = t0.study_start_time
           )
    UNION
    SELECT tt.id
        ,tt.study_start_time
        ,t1.study_end_time
    FROM tab t1
    JOIN tree tt ON t1.id=tt.id
                AND t1.study_start_time = tt.study_end_time
    )
SELECT * FROM tree
WHERE NOT EXISTS (SELECT * FROM tab nx 
                WHERE nx.id=tree.id
                AND tree.study_end_time = nx.study_start_time
                )
ORDER BY id
    ;

Results:

CREATE TABLE
INSERT 0 15
  id  | study_start_time | study_end_time 
------+------------------+----------------
 1234 |              168 |            480
 2345 |              175 |            233
 2345 |              400 |            425
 4567 |              200 |            225
 4567 |              250 |            289
 4567 |              300 |            310
 4567 |              320 |            340
 4567 |              360 |            390
(8 rows)

QUERY plan (after adding obvious PK and index):

DROP TABLE
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "tab_pkey" for table "tab"
CREATE TABLE
CREATE INDEX
INSERT 0 15

                                                                QUERY PLAN                                                                 
-------------------------------------------------------------------------------------------------------------------------------------------
 Merge Anti Join  (cost=16209.59..16292.13 rows=6386 width=12) (actual time=0.189..0.193 rows=8 loops=1)
   Merge Cond: ((tree.id = nx.id) AND (tree.study_end_time = nx.study_start_time))
   CTE tree
     ->  Recursive Union  (cost=0.00..15348.09 rows=8515 width=12) (actual time=0.022..0.136 rows=15 loops=1)
           ->  Merge Anti Join  (cost=0.00..175.04 rows=1455 width=12) (actual time=0.019..0.041 rows=8 loops=1)
                 Merge Cond: ((t0.id = nx.id) AND (t0.study_start_time = nx.study_end_time))
                 ->  Index Scan using tab_pkey on tab t0  (cost=0.00..77.35 rows=1940 width=12) (actual time=0.010..0.018 rows=15 loops=1)
                 ->  Index Scan using sssss on tab nx  (cost=0.00..77.35 rows=1940 width=8) (actual time=0.003..0.008 rows=14 loops=1)
           ->  Merge Join  (cost=1297.04..1500.28 rows=706 width=12) (actual time=0.010..0.012 rows=1 loops=6)
                 Merge Cond: ((t1.id = tt.id) AND (t1.study_start_time = tt.study_end_time))
                 ->  Index Scan using tab_pkey on tab t1  (cost=0.00..77.35 rows=1940 width=12) (actual time=0.001..0.004 rows=9 loops=6)
                 ->  Sort  (cost=1297.04..1333.42 rows=14550 width=12) (actual time=0.006..0.006 rows=2 loops=6)
                       Sort Key: tt.id, tt.study_end_time
                       Sort Method: quicksort  Memory: 25kB
                       ->  WorkTable Scan on tree tt  (cost=0.00..291.00 rows=14550 width=12) (actual time=0.000..0.001 rows=2 loops=6)
   ->  Sort  (cost=726.15..747.44 rows=8515 width=12) (actual time=0.166..0.169 rows=15 loops=1)
         Sort Key: tree.id, tree.study_end_time
         Sort Method: quicksort  Memory: 25kB
         ->  CTE Scan on tree  (cost=0.00..170.30 rows=8515 width=12) (actual time=0.025..0.149 rows=15 loops=1)
   ->  Sort  (cost=135.34..140.19 rows=1940 width=8) (actual time=0.018..0.018 rows=15 loops=1)
         Sort Key: nx.id, nx.study_start_time
         Sort Method: quicksort  Memory: 25kB
         ->  Seq Scan on tab nx  (cost=0.00..29.40 rows=1940 width=8) (actual time=0.003..0.004 rows=15 loops=1)
 Total runtime: 0.454 ms
(24 rows)
wildplasser
  • 43,142
  • 8
  • 66
  • 109
  • Does MySQL support this? If it does, that would be brilliant! However, I think you might have looked at the wrong question ;-) – dash Jan 08 '12 at 18:31
  • No, it does not, I'm sorry. I did not see the mysql tag or it was added later. CTE's are a great way to chase linked lists (that's why I retagged to Islands-and-gaps; IAG's are basically linked lists, so the CTE's are like a knee reflex for me) I'll add a query plan, just for fun... – wildplasser Jan 08 '12 at 18:35
  • Hence my reply ;-) however, I'd love to know what question this does belong to as that is a brilliant, well constructed and well supported answer. – dash Jan 08 '12 at 18:46
  • Here: http://stackoverflow.com/questions/8776944/reducing-table-to-avoid-space-inefficiency/8780227#8780227 I think I got confused by the other overly complex contributions ... – wildplasser Jan 08 '12 at 18:52