This procedure is the most frequently executed one in the target application. Assume concurrent operations; t.value is always changing.
-- Table is MySQL InnoDB; `table`, conditionA, and conditionB are placeholders
-- let's call this MainSelect
SET vIndex = NULL; -- SELECT ... INTO leaves the variable unchanged when no row matches
SELECT t.Id
  INTO vIndex
  FROM `table` t
 WHERE t.A = conditionA AND t.B = conditionB AND t.value > 0
 ORDER BY RAND()
 LIMIT 1
 FOR UPDATE;

IF vIndex IS NOT NULL THEN
  UPDATE `table` SET value = value - 1 WHERE Id = vIndex;
END IF;
The goal is to modify MainSelect to use a form of the following random-row-selection method, chosen for its speed. The method is reproduced here for completeness; adapting it is the main question of this post.
SELECT name
  FROM random AS r1
  JOIN (SELECT (RAND() * (SELECT MAX(id) FROM random)) AS id) AS r2
 WHERE r1.id >= r2.id
 ORDER BY r1.id ASC
 LIMIT 1;
Discussion:
How would the total number of rows matched by MainSelect be determined?
If the answer is to turn MainSelect into a subquery, moving the FOR UPDATE to the outermost query, the randomly selected row's t.value could drop to 0 before the outer SELECT locks the row with the FOR UPDATE. Something like:
SELECT * FROM (firstquery) s ...random selection logic.. FOR UPDATE;
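To make that sketch concrete, here is roughly the shape I mean, reusing the same placeholder names as MainSelect (`table`, conditionA, and conditionB are stand-ins, not real identifiers). The trailing UPDATE re-checks value > 0 defensively, in case the FOR UPDATE through the derived table does not actually lock the base row:

-- Sketch only: MainSelect as derived table s, the random-id trick as r,
-- FOR UPDATE moved to the outermost query.
SELECT s.Id
  INTO vIndex
  FROM (SELECT t.Id
          FROM `table` t
         WHERE t.A = conditionA AND t.B = conditionB AND t.value > 0) AS s
  JOIN (SELECT RAND() * (SELECT MAX(Id) FROM `table`) AS Id) AS r
    ON s.Id >= r.Id
 ORDER BY s.Id ASC
 LIMIT 1
 FOR UPDATE;

-- Defensive: re-check value > 0 in case a concurrent transaction got there first.
UPDATE `table` SET value = value - 1 WHERE Id = vIndex AND value > 0;

The usual caveats of the MAX(id) trick apply: gaps in Id skew the distribution, and a random threshold above every matching Id selects nothing.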
If this concern is accurate, the question becomes which transaction isolation level should be set at the outset.
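For reference, the level would be set just before the transaction begins; READ COMMITTED below is only an example, not a recommendation:

SET TRANSACTION ISOLATION LEVEL READ COMMITTED; -- applies to the next transaction only
START TRANSACTION;
-- MainSelect variant ... FOR UPDATE;
-- UPDATE ... ;
COMMIT;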
Thanks
Edit - Notes while working:
Maybe http://en.wikipedia.org/wiki/Reservoir_sampling, because the row count is unknown. I'd like to avoid high isolation levels, as I'd expect them to lower throughput.
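As a sketch of what a k = 1 reservoir pass could look like in a stored procedure (pick_random_row, pA, and pB are hypothetical names standing in for the placeholders above):

-- Hypothetical sketch: reservoir sampling with k = 1, so no row count is needed.
CREATE PROCEDURE pick_random_row(IN pA INT, IN pB INT, OUT vIndex INT)
BEGIN
  DECLARE done INT DEFAULT 0;
  DECLARE curId INT;
  DECLARE seen INT DEFAULT 0;
  DECLARE cur CURSOR FOR
    SELECT t.Id FROM `table` t
     WHERE t.A = pA AND t.B = pB AND t.value > 0;
  DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;

  SET vIndex = NULL;
  OPEN cur;
  read_loop: LOOP
    FETCH cur INTO curId;
    IF done THEN LEAVE read_loop; END IF;
    SET seen = seen + 1;
    -- Keep the i-th row with probability 1/i; the surviving row is uniform.
    IF RAND() * seen < 1 THEN
      SET vIndex = curId;
    END IF;
  END LOOP;
  CLOSE cur;
END

Note this scans every matching row and, as written, takes no locks, so it sidesteps rather than answers the FOR UPDATE question.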
Perhaps a random number could be stored in an indexed column rather than computed per query. A random threshold is then generated and the nearest row selected; according to the LIMIT documentation quoted below, that lookup is very fast. The problem is the selection will not be uniform: a row's chance of being picked is proportional to the gap between its stored value and the next one below it. A sketch follows the quoted passage.
If you use LIMIT row_count with ORDER BY, MySQL ends the sorting as soon as it has found the first row_count rows of the sorted result, rather than sorting the entire result. If ordering is done by using an index, this is very fast.
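Here is a sketch of that stored-random-column idea (the rnd column and idx_rnd index are hypothetical); the threshold goes into a variable so RAND() is evaluated once rather than per row:

-- Hypothetical: precomputed random column with an index.
ALTER TABLE `table`
  ADD COLUMN rnd DOUBLE NOT NULL DEFAULT 0,
  ADD INDEX idx_rnd (rnd);
UPDATE `table` SET rnd = RAND();

-- Pick the first row at or above a one-shot random threshold.
SET @r = RAND();
SELECT t.Id
  FROM `table` t
 WHERE t.rnd >= @r
 ORDER BY t.rnd ASC
 LIMIT 1;

A threshold above MAX(rnd) selects nothing, so a wrap-around retry would also be needed.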