1

How to pick exactly* half of the item of a list with equity**?

(*)Half is n/2, the int part of this is enought.

The number of item in the list is more than 10^9. So n/2-1 is not close enought to n to be a noticable approximation.

(**)The principe of equity and fairness mean that every items of the list have the same probality of beeing pick.

I had a Equity Picker like this:

 myList.Where(i => rand.NextDouble() >= 0.5);

but with this I can have more/less than half, with a slight chance of all /none of them.

Disclaimer: this is not an home work more a reflexion and modelisation around Thanos Random Picker. The ability to pick half of every population with equity.

xdtTransform
  • 1,986
  • 14
  • 34

1 Answers1

1

If you know the number of items in the sequence, then you can select count of them randomly and with equity using the following code:

public static IEnumerable<T> RandomlySelectedItems<T>(IEnumerable<T> sequence, int count, int sequenceLength, Random rng)
{
    int available = sequenceLength;
    int remaining = count;

    using (var iterator = sequence.GetEnumerator())
    {
        for (int current = 0; current < sequenceLength; ++current)
        {
            iterator.MoveNext();

            if (rng.NextDouble() < remaining / (double)available)
            {
                yield return iterator.Current;
                --remaining;
            }

            --available;
        }
    }
}

This has the advantage that it does not make a copy of the sequence, and it is an O(N) operation.

For your example, you would pass count as n/2.

Matthew Watson
  • 104,400
  • 10
  • 158
  • 276
  • I like the O(n), But the fact that you stop picking once the half is pick mean that element that was not iterate didn't had a chance of beeing pick, It's not a pick with 1/N-i, where i is the iterator. But still. It feel wrong I have to do the math do see the real probality of element – xdtTransform Jul 05 '18 at 14:04
  • @xdtTransform The maths on this does work out. The element at the end of the list will only not be iterated 1/2 of the time; the 1/2 of the time it is iterated, it will always be picked. – Rawling Jul 05 '18 at 14:12
  • @Rawling, the math look right. I can't ping you on dupe taging, but most of the answer on this question violate the fair equity. that is not explicity part of the question. – xdtTransform Jul 05 '18 at 14:21
  • 1
    @xdtTransform This algorithm is well known, and it does pick all elements with even distribution. – Matthew Watson Jul 05 '18 at 14:47
  • It is not completely mathematically even distribution because floating point numbers are imprecise - but it should be close enough for all practical purposes. An alternative would be to do a random shuffle on your array, and then take the first N elements (but that changes the order of your array). – Matthew Watson Jul 05 '18 at 14:52
  • @xdtTransform Say 6 elements. First is `3/6 = 0.5` chance of picking. Second is `0.5*2/5 + 0.5*3/5 = 0.5` chance. Third is `0.25*1/4 + 0.5*2/4 + 0.25*3/4 = 0.5` chance. It carries on like that. – Rawling Jul 06 '18 at 07:12