If you know in advance the total number of items from which to randomly pick, you can do it without converting to a list first.
The following method will do it for you:
/// <summary>Randomly selects items from a sequence.</summary>
/// <typeparam name="T">The type of the items in the sequence.</typeparam>
/// <param name="sequence">The sequence from which to randomly select items.</param>
/// <param name="count">The number of items to randomly select from the sequence.</param>
/// <param name="sequenceLength">The number of items in the sequence among which to randomly select.</param>
/// <param name="rng">The random number generator to use.</param>
/// <returns>A sequence of randomly selected items.</returns>
/// <remarks>This is an O(N) algorithm (N is the sequence length).</remarks>
public static IEnumerable<T> RandomlySelectedItems<T>(IEnumerable<T> sequence, int count, int sequenceLength, System.Random rng)
{
if (sequence == null)
{
throw new ArgumentNullException("sequence");
}
if (count < 0 || count > sequenceLength)
{
throw new ArgumentOutOfRangeException("count", count, "count must be between 0 and sequenceLength");
}
if (rng == null)
{
throw new ArgumentNullException("rng");
}
int available = sequenceLength;
int remaining = count;
var iterator = sequence.GetEnumerator();
for (int current = 0; current < sequenceLength; ++current)
{
iterator.MoveNext();
if (rng.NextDouble() < remaining/(double)available)
{
yield return iterator.Current;
--remaining;
}
--available;
}
}
(The key thing here is needing to know at the start the number of items to choose from; this does reduce the utility somewhat. But if getting the count is quick and buffering all the items would take too much memory, this is a useful solution.)
Here's another approach which uses Reservoir sampling
This approach DOES NOT need to know the total number of items to choose from, but it does need to buffer the output. Of course, it also needs to enumerate the entire input collection.
Therefore this is really only of use when you don't know in advance the number of items to choose from (or the number of items to choose from is very large).
I would recommend just shuffling a list as per Henk's answer rather than doing it this way, but I'm including it here for the sake of interest:
// n is the number of items to randomly choose.
public static List<T> RandomlyChooseItems<T>(IEnumerable<T> items, int n, Random rng)
{
var result = new List<T>(n);
int index = 0;
foreach (var item in items)
{
if (index < n)
{
result.Add(item);
}
else
{
int r = rng.Next(0, index + 1);
if (r < n)
result[r] = item;
}
++index;
}
return result;
}
As an addendum to Henk's answer, here's a canonical implementation of the Shuffle algorithm he mentions. In this, _rng
is an instance of Random
:
/// <summary>Shuffles the specified array.</summary>
/// <typeparam name="T">The type of the array elements.</typeparam>
/// <param name="array">The array to shuffle.</param>
public void Shuffle<T>(IList<T> array)
{
for (int n = array.Count; n > 1;)
{
int k = _rng.Next(n);
--n;
T temp = array[n];
array[n] = array[k];
array[k] = temp;
}
}