0

I'm just getting into optimizing queries by logging slow queries and EXPLAINing them. I guess the thing is... I'm not sure exactly what kind of things I should be looking for.... I have the query

SELECT DISTINCT
       screenshot.id,
       screenshot.view_count
  FROM screenshot_udb_affect_assoc
INNER JOIN screenshot ON id = screenshot_id
     WHERE unit_id = 56 
  ORDER BY RAND() 
     LIMIT 0, 6;

Looking at these two elements.... where should I focus on optimization?

id  select_type table   type    possible_keys   key key_len ref rows    Extra
1   SIMPLE  screenshot  ALL PRIMARY NULL    NULL    NULL    504 Using temporary; Using filesort
1   SIMPLE  screenshot_udb_affect_assoc ref screenshot_id   screenshot_id   8   source_core.screenshot.id,const 3   Using index; Distinct
Ben
  • 60,438
  • 111
  • 314
  • 488

3 Answers3

3

To begin with please refrain using ORDER BY RAND(). This in particular degrades performance when the table size is large. For example, even with limit 1 , it generates number of random numbers equal to the row count, and would pick the smallest one. This might be inefficient if table size is large or bound to grow. Detailed discussion on this can be found at: http://www.titov.net/2005/09/21/do-not-use-order-by-rand-or-how-to-get-random-rows-from-table/

Lastly, also ensure that your join columns are indexed.

kuriouscoder
  • 5,394
  • 7
  • 26
  • 40
  • I see his example and it's brilliant, except that I have to make sure the rows meet my specific criteria..... – Ben Nov 07 '10 at 04:38
  • Under 100K records, it's fine to use `ORDER BY RAND()` -- over that, and you want to start looking at options that scale better. For more info [see this article](http://www.dasprids.de/blog/2008/06/07/fetching-random-rows-of-mysql-efficiently) – OMG Ponies Nov 07 '10 at 04:39
1

Try:

  SELECT s.id,
         s.view_count
    FROM SCREENSHOT s
   WHERE EXISTS(SELECT NULL
                  FROM SCREENSHOT_UDB_AFFECT_ASSOC x
                 WHERE x.screenshot_id = s.id)
ORDER BY RAND()
   LIMIT 6

Under 100K records, it's fine to use ORDER BY RAND() -- over that, and you want to start looking at alternatives that scale better. For more info, see this article.

Community
  • 1
  • 1
OMG Ponies
  • 325,700
  • 82
  • 523
  • 502
  • I did that, it yields the same results except it also has `Using where;` listed in addition to `Using temporary; Using filesort` – Ben Nov 09 '10 at 14:51
1

I agree with kuriouscoder, refrain from using ORDER BY RAND(), and make sure each of the following fields are indexed in a single index:

screenshot_udb_affect_assoc.id

screenshot.id

screenshot.unit_id

do this using code like:

create index Index1 on screenshot(id):

mariana soffer
  • 1,853
  • 12
  • 17