0

Possible Duplicate:
MySQL: Alternatives to ORDER BY RAND()

I'm pulling about 20 random rows, most or all with WHERE clauses attached, in a table of 1000~2000 records. This won't grow by much, maybe 150 every year.

I'm also looking at a few thousand users at most, but with this query being part of a popular area of the site.

Do I really need to be concerned about performance hits where ORDER BY RAND() is concerned? I've heard all the horror stories about performance issues, but I haven't found much when such a small table is being hit constantly.

I can't do a cache for this.

Community
  • 1
  • 1
  • 2
    `ORDER BY RAND()` is by definition the worst performing solution. That does not automatically mean it is *too slow*. In your case, it's obviously performant enough. Do you need to worry about it? Well, apparently not. Until you do. – deceze May 15 '12 at 03:03
  • For the dupe, I already browsed over that question before I made this one. It doesn't cover quite the same issue as I'm concerned about efficiency on a small table that gets hit a lot but doesn't update much. @deceze: I'm asking because it's the worst performing solution, but it's also the easiest for what I'm doing. However I don't want to turn into a database sadist. I'm basically looking for the best possible answer for this particular case because I couldn't reach a conclusion myself. :) –  May 15 '12 at 03:28

1 Answers1

2

The ORDER BY RAND() is easily used and forgotten, but you have to realize that MySQL needs to inspect every row because RAND() is not constant.

Inspecting 1000 rows shouldn't be a problem, but keep an eye on the query performance over time. If you just needed one record, you could have found an answer here: ORDER BY RAND() alternative

Edit

You said you can't do caching, but even a small generated .php file is a cache :)

For instance, you can store a list of ID's in a cache; you update it every time you insert or remove a database record. Then, you pull 20 random ID's using PHP and use those to query the database (using e.g. IN (4, 7, 10, 45, etc.)

Update

As pointed out by OP, this is not possible if the query also has conditions (i.e. not every item in the table is eligable to be picked).

Community
  • 1
  • 1
Ja͢ck
  • 170,779
  • 38
  • 263
  • 309
  • Well, the results are pulled into a PHP array, to be used as part of a game, which is why I said that it can't be cached. It does this, say, every new game, for every computer player. For the cache, I was more or less saying that I couldn't generate these and store them, so psuedorandom? Either way, the results need to be checked against the other computer players, which is why it can't happen. Unfortunately I need 20 random rows per each computer player. –  May 15 '12 at 03:09
  • I had also thought about using an offset, but that will only pull one record at a time, which seems like it would drag performance down anyway. Possibly even further? –  May 15 '12 at 03:10
  • You can create a small cache that contains only the ID. Just update the cache when you insert or remove a new record. Within PHP you just pull 20, and then use that to look up in DB (using `IN` operator) – Ja͢ck May 15 '12 at 03:11
  • Is it worth doing if the table is only eight columns wide, with six of those eight being tinyint? Also, at least two of said columns need to be used in WHERE statements. –  May 15 '12 at 03:20
  • @omgworm If you're already using a `WHERE` the number of rows is already reduced though ... – Ja͢ck May 15 '12 at 03:22
  • Thanks. That's what I thought too. Like I just answered to deceze though, I didn't want to turn into a database sadist by doing what's been noted as the worst performing solution on a site with possible several thousand users, but in this particular instance I think it's best for suiting the needs also. With the WHERE it is limited to a couple of hundred results at the most. :) –  May 15 '12 at 03:31