I have a pretty simple application that needs to (as fairly as possible) randomly assign a person to a team.

At the moment, I am trying a few different methods, such as:

Team.where(assigned: false).order("RANDOM()").first

as well as loading the records into an array and using sample():

arr.sample().inspect

However, these don't appear to be truly random: they typically leave the edges (1,2..8,9 where count = 10) until last. Is there a better method, with or without ActiveRecord? Is there a mathematically noticeable difference between PostgreSQL's random() and SQLite's random()?

Any assistance with loops to generate such a random distribution is appreciated!

williamthomas
  • Sample has always been fair to me - have you tried doing this a couple of hundred thousand times and counting how many times each object is picked? – Frederick Cheung Oct 30 '14 at 09:44
  • @williamthomas you can use offset to select a random record; see this link: http://stackoverflow.com/questions/2752231/random-record-in-activerecord – anusha Oct 30 '14 at 09:45
  • Yeah, it "seems" fine, much much better than straight from db. Not sure if caching is having any impact on it. – williamthomas Oct 30 '14 at 09:46
  • Humans tend to fail to detect true randomness as such. While computer random algorithms are so-called pseudorandom algorithms, people usually mix up 'random' with 'even distribution' (even within a very local scope). That is, if you toss a coin 100 times, it's perfectly reasonable to expect a few sequences where you get either heads or tails 5 times in a row. Actually, NOT having such sequences is a strong indication that the sequence is not truly random. I've never had issues with just rand(), though for security purposes I do use SecureRandom (better) or bcrypt (likely best for passwords). – EdvardM Oct 30 '14 at 10:16

1 Answer

Although there may be other ways to generate random numbers to select a record (even extreme measures like random.org), the random selections you have provided should prove more than sufficient, with little or no difference between them.

You should, however, be aware of the difference in overhead between the two: the first returns only one record from the database, while the second pulls all the records into memory and picks one at random. If the Team table were to become large, the second approach could drain memory and CPU time and slow the whole operation. I would only use the second method if I needed the whole table in memory for some other purpose as well.
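If you want to check the fairness for yourself, as suggested in the comments, a rough sketch along these lines counts how often each element is picked over many trials. The `teams` array of integers is only a stand-in for your Team records, and the trial count and seed are arbitrary assumptions; the seed is fixed purely so the run is repeatable.

```ruby
# Hypothetical fairness check: sample from ten stand-in "teams" many times
# and count how often each one is chosen.
teams  = (1..10).to_a      # stand-ins for Team records (assumption)
trials = 200_000           # arbitrary number of picks
rng    = Random.new(42)    # fixed seed so the run is repeatable

counts = Hash.new(0)
trials.times { counts[teams.sample(random: rng)] += 1 }

# With a fair picker, every team should land close to trials / teams.size.
expected = trials / teams.size.to_f
counts.sort.each do |team, n|
  deviation = (n - expected) / expected * 100
  puts format("team %2d: %6d picks (%+.2f%% from expected)", team, n, deviation)
end
```

If the edges really were being left until last, the counts for the first and last elements would sit visibly below the others; with a fair picker, all ten should differ from the expected value only by a small amount.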

fifi