The trick is to use a variation of shuffle or in other words a partial shuffle.
performance is not the only criterion, statistical efficiency, i.e unbiased sampling is as important (as the original shuffle
solution is)
function random_pick( $a, $n )
{
$N = count($a);
$n = min($n, $N);
$picked = array_fill(0, $n, 0); $backup = array_fill(0, $n, 0);
// partially shuffle the array, and generate unbiased selection simultaneously
// this is a variation on fisher-yates-knuth shuffle
for ($i=0; $i<$n; $i++) // O(n) times
{
$selected = mt_rand( 0, --$N ); // unbiased sampling N * N-1 * N-2 * .. * N-n+1
$value = $a[ $selected ];
$a[ $selected ] = $a[ $N ];
$a[ $N ] = $value;
$backup[ $i ] = $selected;
$picked[ $i ] = $value;
}
// restore partially shuffled input array from backup
// optional step, if needed it can be ignored, e.g $a is passed by value, hence copied
for ($i=$n-1; $i>=0; $i--) // O(n) times
{
$selected = $backup[ $i ];
$value = $a[ $N ];
$a[ $N ] = $a[ $selected ];
$a[ $selected ] = $value;
$N++;
}
return $picked;
}
NOTE the algorithm is strictly O(n)
in both time and space, produces unbiased selections (it is a partial unbiased shuffling) and produces output which is proper array with consecutive keys (not needing extra array_values
etc..)
Use example:
$randomly_picked = random_pick($my_array, 5);
// or if an associative array is used
$randomly_picked_keys = random_pick(array_keys($my_array), 5);
$randomly_picked = array_intersect_key($my_array, array_flip($randomly_picked_keys));
For further variations and extensions of shuffling for PHP:
- PHP - shuffle only part of an array
- PHP shuffle with seed
- How can I take n elements at random from a Perl array?