I have found few interesting things (for me at least) while generating random integers and I cannot explain it to myself so I thought I will post it here.
My needs were simple: I was generating random integral (Int32) IDs and was aiming to minimize collisions. The generating time wasn´t an issue.
I have tried these methods for generating random integers:
1.)
return rnd.Next();
where rnd is class field of type Random with Seed from Method #3.
2.)
return rnd.Next(Int32.MinValue, Int32.MaxValue);
where rnd is again class field of type Random with Seed from Method #3.
3.)
var buffer = new byte[sizeof(int)];
using (var rng = new RNGCryptoServiceProvider())
{
rng.GetBytes(buffer);
}
return BitConverter.ToInt32(buffer, 0);
Note: I also tried to have that RNGCryptoServiceProvider as class field initialized once on initialization of the containg class to ease work of GC, but I took same time to generate so I thought this would be "more random".
4.)
return new Random(Method3()).Next();
5.)
return new Random(Method3()).Next(Int32.MinValue, Int32.MaxValue);
I know, that creating new Random(int seed) on every call is time consuming, but there is less collision, right?
Well, now the mystery part. I was asuming that most collisions would have the method #1 and #2 where #1 would be slightly faster and more collisionless and least collisions would have the method #4 and #5 where #4 would be slightly faster and more collisionless and method #3 would be some sort of compromise.
So I made a test to prove my asumptions. I generated 10x (to make it average) one million of random numbers using every mentioned method and took the average number of collision and average time it took to generate one million numbers. The results (below) are a bit suprise to me.
The results:duration is hours:minutes:seconds:miliseconds format
Method1: AvgCollisions: 235, AvgDuration: 00:00:00.3561967
Method2: AvgCollisions: 116, AvgDuration: 00:00:00.4042033
Method3: AvgCollisions: 115, AvgDuration: 00:00:04.6037259
Method4: AvgCollisions: 234, AvgDuration: 00:00:09.2195856
Method5: AvgCollisions: 233, AvgDuration: 00:00:09.1788223
I ran the test few times again and it was always circa the same.
Isn't it strange to you, too? The times aren't suprise at all that is what I assumed, but the results mean to me, that method2 is the best to generate random numbers, as it is the most random, the fastest and you can set a minimal and maximal generated number. Don't know how much is the Method2 more predictable over the Method3 as I don't know how would I test it.
Can someone explain me what am I doning wrong or why methods #4 and #5 don't have the least collisions and why is the percentage of collisions always the same? Shouldn't it be random?
EDIT: Here is Visual Studio 2010 Solution of this test I've done: http://bit.ly/nxLODw