I'm looking for an efficient way of randomly selecting 100 rows satisfying certain conditions from a MySQL table with potentially millions of rows.
Almost everything I've found suggests avoiding the use of ORDER BY RAND(), because of poor performance and scalability.
However, this article suggests ORDER BY RAND() may still be used as a "nice and fast way" to fetch random data.
Based on this article, below is some example code showing what I'm trying to accomplish. My questions are:
Is this an efficient way of randomly selecting 100 (or up to several hundred) rows from a table with potentially millions of rows?
When will performance become an issue?
    SELECT user.*
    FROM (
        SELECT id
        FROM user
        WHERE is_active = 1
          AND deleted = 0
          AND expiretime > '.time().'
          AND id NOT IN (10, 13, 15)
          AND id NOT IN (20, 30, 50)
          AND id NOT IN (103, 140, 250)
        ORDER BY RAND()
        LIMIT 100
    ) AS random_users
    STRAIGHT_JOIN user ON user.id = random_users.id
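To make the pattern easier to experiment with, here is a minimal, self-contained sketch of the same idea in Python. It is only an approximation of my setup: it uses an in-memory SQLite database as a stand-in for MySQL (so RANDOM() replaces RAND() and a plain JOIN replaces STRAIGHT_JOIN), the table contents are made up, and it says nothing about how MySQL itself will perform at millions of rows.

```python
import sqlite3
import time

# In-memory SQLite database as a stand-in for the real MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE user (
           id INTEGER PRIMARY KEY,
           name TEXT,
           is_active INTEGER,
           deleted INTEGER,
           expiretime INTEGER
       )"""
)

# Made-up sample data: 1000 active, non-deleted users that expire in an hour.
now = int(time.time())
rows = [(i, f"user{i}", 1, 0, now + 3600) for i in range(1, 1001)]
conn.executemany("INSERT INTO user VALUES (?, ?, ?, ?, ?)", rows)

# The ids from the three NOT IN lists, merged into one parameter list.
excluded = [10, 13, 15, 20, 30, 50, 103, 140, 250]
placeholders = ",".join("?" * len(excluded))

# Inner query: shuffle only the id column and keep 100 ids.
# Outer query: join back to fetch the full rows for just those ids.
sql = f"""
SELECT user.*
FROM (
    SELECT id
    FROM user
    WHERE is_active = 1
      AND deleted = 0
      AND expiretime > ?
      AND id NOT IN ({placeholders})
    ORDER BY RANDOM()
    LIMIT 100
) AS random_users
JOIN user ON user.id = random_users.id
"""
picked = conn.execute(sql, [now, *excluded]).fetchall()
print(len(picked), "rows selected")
```

The point of the inner query is that only the id column is sorted randomly; the wide user rows are fetched afterwards for the 100 selected ids only.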