I have a large database (MySQL, Aurora serverless) and I would like to get random rows (like 1 or 5) I know that using SORT BY RAND() is very slow, so that’s discarded.
I also know that here some tricks use the identifier of the row, but this is only working when the id is an integer autoincremented.
In my case, my database uses BINARY(16) as an identifier/primary key, and it is a randomly generated hash.
The thing is, what should I do to retrieve random rows for this configuration?
Note that in my case speed is more important than accuracy, so if it is not a perfectly random row, it is not a big issue.
Some ideas I have that I don’t know if they are good or bad:
-Every time I add a new row, I also add an extra column that uses RAND(), and I use that field to sort. Problem is, this will generate the same random rows again and again. Unless I update that field regularly. Seems too complex.
-Send 2 requests. The first one to get the oldest createdAt date. Then, the second one, sort it using a random date between the oldest one and now. This is not 100% accurate because creation dates are not distributed uniformly, but as I said, speed is more important than accuracy in my use case.
-Somehow, use my ids, because they are already random, perhaps I can sort starting from a random bit. No idea.
What do you think? Do you have more ideas? Thanks.