10

I have a value, say 20010. I want to randomly divide this value over 24 hours. So basically split the value into a 24 slot big array where all slots are randomly big.

What could be a good way solving this using C#?

Riri
  • 11,501
  • 14
  • 63
  • 88

11 Answers11

26

Draw 23 (not 24) numbers at random, (without duplicates), in the range 1 to 20009. Add 0 and 20010 the list and order these numbers, the difference between each two consecutive number gives you one slot value.

An online approach is also possible by drawing one value at a time and subtracting it from the "pot", drawing anew when the number is bigger than the amount left. This approach however may lead to a greater deviation of the size of the slots.

mjv
  • 73,152
  • 14
  • 113
  • 156
24

Here's a functional solution using mjv's algorithm:

static int[] GetSlots(int slots, int max)
{
    return new Random().Values(1, max)
                       .Take(slots - 1)
                       .Append(0, max)
                       .OrderBy(i => i)
                       .Pairwise((x, y) => y - x)
                       .ToArray();
}

public static IEnumerable<int> Values(this Random random, int minValue, int maxValue)
{
    while (true)
        yield return random.Next(minValue, maxValue);
}

public static IEnumerable<TResult> Pairwise<TSource, TResult>(this IEnumerable<TSource> source, Func<TSource, TSource, TResult> resultSelector)
{
    TSource previous = default(TSource);

    using (var it = source.GetEnumerator())
    {
        if (it.MoveNext())
            previous = it.Current;

        while (it.MoveNext())
            yield return resultSelector(previous, previous = it.Current);
    }
}

public static IEnumerable<T> Append<T>(this IEnumerable<T> source, params T[] args)
{
    return source.Concat(args);
}
dahlbyk
  • 75,175
  • 8
  • 100
  • 122
  • In the eye of the beholder, I suppose. I rather like that the core implementation reads almost exactly as mjv describes it. – dahlbyk Oct 17 '09 at 05:23
  • actually - i think this an elegant straight forward solution.. clever, but not 'too' clever. – headsling Oct 17 '09 at 22:15
  • I don't mind something like this as an amusement. I even learned a thing or two about c# by reading it! But if someone brought me this in a code review, I would start pounding the Advil. For something matching my idea of elegance more closely, check out my python implementation below. – PeterAllenWebb Oct 19 '09 at 03:20
  • Out of curiosity, what in particular about it is Advil-inducing? In my view, Random.Values, Pairwise and Append are all extensions that could reasonably be applied to a number of situations. Any initial hit in readability ("What does Pairwise do?") is a price paid once that is repaid every time similar functionality would otherwise have to be implemented by hand. – dahlbyk Oct 19 '09 at 04:57
  • 1
    I agree with that. If this is a coding style my team had agreed to adopt, and we kept a library around to support it, I would fully support this solution. Otherwise, I would break out the Advil because we would have to decide whether a simple algorithm needs a solution involving iterators, generics, the yield keyword, etc... – PeterAllenWebb Oct 19 '09 at 14:46
  • I like this code. Good layering and easy to follow logic. I would simplify the Pairwise though and use Concat directly. – Andriy Volkov Oct 19 '09 at 20:15
  • I can see using Concat directly if I didn't already have Append defined. How would you simplify Pairwise? – dahlbyk Oct 19 '09 at 20:33
  • public static IEnumerable Pairwise(IEnumerable source, Func resultSelector) { T current = default(T); foreach (var i in source) yield return current = resultSelector(current, i); } – Andriy Volkov Oct 20 '09 at 13:56
  • I'm not sure we have the same understanding of Pairwise. I'm trying to map [1,2,3] to [resultSelector(1,2),resultSelector(2,3)]. Your implementation seems like more of a fold (aka Aggregate). – dahlbyk Oct 20 '09 at 14:24
  • isn't that what I do in mine?! anyhow my point was to use foreach and single generic type T just to make the code more readable – Andriy Volkov Oct 24 '09 at 00:48
  • Using foreach doesn't properly handle the first element in the sequence - the result sequence should be one shorter than the input sequence. The second generic type allows us to select out a different result type - an anonymous type with the paired values, for example. Not needed here, but useful elsewhere. – dahlbyk Oct 25 '09 at 06:45
  • For pairwise, how about: `bool haveValue = false; (...) foreach(...) { if(haveValue) yield return (...); else haveValue = true; (...)}` – Benjol Feb 09 '10 at 10:07
  • @Benjol You could certainly make that work, but the compiler is turning your foreach into a using+while loop anyway. Particularly for utility classes you write (and test) once then reuse, you might as well do it "right" even if it's a little trickier. – dahlbyk Feb 12 '10 at 03:34
11

Assuming that you don't want to have much (any) control over the distribution of sizes, here's an approach that would work (pseudo-code).

  1. Create a list of 24 random values, generated however you like, at whatever scale
  2. Find the sum of this list
  3. Create your final list by scaling the 24 random values against your total

Notes

  • If you use floating point arithmetic, you may be off by one or two. To avoid this, don't use scaling to complete the last value, instead fill it in with the total remaining.
  • If you do need tighter control over the distribution, use a different method to generate your initial array, but the rest doesn't need to change.
Bevan
  • 43,618
  • 10
  • 81
  • 133
  • +1 good solution mate (i wasn't sure this is the question actually) – Letterman Oct 17 '09 at 04:46
  • I am trying to understand this suggestion, but I dont understand what "by scaling the 24 random values against your total" means. Could anyone provide an example what that exactly means in code? – Max Oct 17 '09 at 07:44
  • 2
    double maxValue = 100; var values = new List { 1000, 2000, 3000, 4000, 5000 }; double scalingFactor = maxValue / values.Sum(); var scaledValues = values.Select(itm => itm * scalingFactor); var scaledValuesSum = scaledValues.Sum(); – Gregory Oct 17 '09 at 09:43
  • Thank you! Now I got it. Pretty neat way to do it. – Max Oct 17 '09 at 12:17
6

This is fun. Inspired by David, here's an implementation of mjv's solution using only LINQ-provided operators. Since David's Dictionary key is just an index, we can use an array instead for the Pairwise functionality:

var r = new Random();
var a = Enumerable.Repeat(null, n - 1)        // Seq with (n-1) elements...
                  .Select(x => r.Next(1, m))  // ...mapped to random values
                  .Concat(new [] { 0, m })
                  .OrderBy(x => x)
                  .ToArray();

return a.Skip(1).Select((x,i) => x - a[i]);
dahlbyk
  • 75,175
  • 8
  • 100
  • 122
4

I've calculated the average size of each of the 24 buckets over 100 trials for each of the algorithms proposed here. I thought it was interesting that three out of the four do seem to result in 20010/24 items per bucket on average, but the naive method I described converges to that average most quickly. This makes some intuitive sense to me. That method is something like snowing randomly on 24 buckets, and thus likely to result in buckets which are approximately equal in size. The others are more like hacking randomly at a length of wood.

Bevan:  [751, 845, 809, 750, 887, 886, 838, 868, 837, 902, 841, 812, 818, 774, 815, 857, 752, 815, 896, 872, 833, 864, 769, 894]
Gregory:  [9633, 5096, 2623, 1341, 766, 243, 159, 65, 21, 19, 16, 4, 3, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2]
mjv:  [895, 632, 884, 837, 799, 722, 853, 749, 915, 756, 814, 863, 842, 642, 820, 805, 659, 862, 742, 812, 768, 816, 721, 940]
peterallenwebb:  [832, 833, 835, 829, 833, 832, 837, 835, 833, 827, 833, 832, 834, 833, 836, 833, 838, 834, 834, 833, 834, 832, 836, 830]

And here is the python code: import random

N = 20010;

def mjv():
    gaps = [ random.randrange(0, N) for i in range(0, 24) ]
    gaps = gaps + [0, N]
    gaps.sort()
    value = [ gaps[i+1] - gaps[i] for i in range(0, 24) ]
    return value

def gregory():
    values = []
    remainingPortion = N

    for i in range(0, 23):
        val = random.randrange(1, remainingPortion - (23 - i))
        remainingPortion = remainingPortion - val
        values.append(val)

    values.append(remainingPortion)

    return values


def peterallenwebb():
    values = [0  for i in range(0, 24) ]

    for i in range(0, N):
        k = random.randrange(0, 24)
        values[k] = values[k] + 1 

    return values

def bevan():
    values = [];
    sum = 0.0

    for i in range(0, 24):
        k = random.random()
        sum = sum + k
        values.append(k);

    scaleFactor = N / sum

    for j in range(0, 24):
        values[j] = int(values[j] * scaleFactor)

    return values


def averageBucketSizes(method):
    totals = [0 for i in range(0, 24)]
    trials = 100

    for i in range(0,trials):
        values = method()

        for j in range(0, 24):
            totals[j] = totals[j] + values[j]      

    for j in range(0, 24):
        totals[j] = totals[j] / trials

    return totals;


print 'Bevan: ', averageBucketSizes(bevan)
print 'Gregory: ', averageBucketSizes(gregory)
print 'mjv: ', averageBucketSizes(mjv)
print 'peterallenwebb: ', averageBucketSizes(peterallenwebb)

Let me know if you see any mistakes. I will re-run.

PeterAllenWebb
  • 10,319
  • 3
  • 37
  • 44
  • This is barely readable. This code reminds me programs I wrote in high-school. – Andriy Volkov Oct 19 '09 at 20:12
  • Sorry about the lack of horizontal white-space. This is only my second python program. I've added white-space now that I realize that python doesn't mind. Still I don't think it makes that big a difference. How would you write mjv's alg in python, for instance? – PeterAllenWebb Oct 19 '09 at 23:57
  • @zkolkov might have meant that it is unidiomatic. For example, bevan is more pythonic as: `def bevan(N, sample_size): values = [random.random() for _ in range(sample_size)] scaleFactor = N / sum(values) return [int(value * scaleFactor) for value in values] ` mjv: `def mjv(N, sample_size): gaps = [0] + sorted(random.randrange(0, N) for _ in range(sample_size-1)) + [N] return [gaps[i+1] - gaps[i] for i in range(sample_size)] `or `def mjv(N, size): gaps = sorted(random.randrange(0, N) for _ in range(size-1)) return [y - x for x, y in zip([0] + gaps, gaps + [N])]` – hughdbrown Nov 09 '11 at 04:21
3

If you want to be sure that you're not biasing the process without much analysis, you could just create a 24 element array, initialize each element to 0 and then add 1 to one of the elements at random 20010 times.

It all depends on the kind of distributions you want to see, but I don't think any of the other techniques recommend so far will result in the hour-long "buckets" being statistically indistinguishable.

PeterAllenWebb
  • 10,319
  • 3
  • 37
  • 44
2

Another option would be to generate a random number between 0 and the target number. Then, add each "piece" to a list. Choose the largest "piece" and cut it in two, using another random number. Select the largest from the list (now with three pieces), and continue until you have the desired number of pieces.

List<int> list = new List<int>();
list.Add(2010);
Random random = new Random();
while (list.Count() < 24)
{
    var largest = list.Max();
    var newPiece = random.Next(largest - 1);
    list.Remove(largest);
    list.Add(newPiece);
    list.Add(largest - newPiece);
}
John Fisher
  • 22,355
  • 2
  • 39
  • 64
2

Here's another solution that I think would work very well for this. Each time the method is called, it will return another set of randomly distributed values.

public static IEnumerable<int> Split(int n, int m)
{
    Random r = new Random();
    int i = 0;

    var dict = Enumerable.Range(1, m - 1)
        .Select(x => new { Key = r.NextDouble(), Value = x })
        .OrderBy(x => x.Key)
        .Take(n - 2)
        .Select(x => x.Value)
        .Union(new[] { 0, m })
        .OrderBy(x => x)
        .ToDictionary(x => i++);

    return dict.Skip(1).Select(x => x.Value - dict[x.Key - 1]);
}
David Morton
  • 16,338
  • 3
  • 63
  • 73
  • Clever use of a dictionary to mimic Pairwise! I'd probably stick with r.Next(1,m-1) for randomizing the intervals - shuffling the sequence of all 20010 items adds quite a bit of overhead. Also, Concat is preferable to Union if you don't need the actual set operation - Union has extra logic to eliminate duplicates, which can't happen here. – dahlbyk Oct 18 '09 at 02:51
1

This will give you a somewhat "decreasing" randomness the higher the index becomes. You can randomise the list positions if required? It depends on what you need to do with it.

int initialValue = 20010;
var values = new List<int>();

Random rnd = new Random();
int currentRemainder = initialValue;

for (int i = 0; i < 21; i++)
{
    //get a new value;
    int val = rnd.Next(1, currentRemainder - (21 - i));

    currentRemainder -= val;
    values.Add(val);
}

values.Add(currentRemainder);

//initialValue == values.Sum()
Tyler
  • 17,669
  • 10
  • 51
  • 89
Gregory
  • 501
  • 6
  • 13
  • 1
    You might consider renaming currentRemainder to something like remainingPortion. Using remainder might get a little confusing, because there is no division involved. Just a thought. – Rob Lund Oct 17 '09 at 04:43
  • There is a question of distribution. Is it acceptable when the first random number is 20010 - 21, that every "random" number afterward is a "1"? – John Fisher Oct 17 '09 at 05:28
  • Agreed. Off the top of my head (zero testing/thinking applied) you could possibly modify this to use currentRemainder / 2 to somewhat flatten out the "curve" in the probability. – Gregory Oct 17 '09 at 06:58
1
class Numeric
  def n_rands(n)
    raw = (1..n).map { |x| rand } 
    raw.map { |x| x * to_f / raw.sum.to_f }.map { |x| x.to_i }.tap do |scaled|
      scaled[-1] = self - scaled[0..-2].sum
    end
  end
end

puts 1000.n_rands(10).inspect # [22, 70, 180, 192, 4, 121, 102, 179, 118, 12]
Mike H
  • 509
  • 3
  • 2
1

I tried the solutions from David and dahlbyk but had no luck. So here is what I came up with after reading the answer from mjv:

public static class IntExtensions
{
    public static IEnumerable<int> Split(this int number, int parts)
    {
        var slots = Enumerable.Repeat(0, parts).ToList();
        var random = new Random();

        while (number > 0)
        {
            var slot = random.Next(0, parts);
            slots[slot]++;
            number--;
        }
        return slots;
    }
}
Christoph
  • 26,519
  • 28
  • 95
  • 133