2

Say I have a sorted list of 1000 or so unique decimals, arranged by value.

List<decimal> decList

How can I get a random x number of decimals from a list of unique decimals that total up to y?

private List<decimal> getWinningValues(int xNumberToGet, decimal yTotalValue)
{

}

Is there any way to avoid a long processing time on this? My idea so far is to take xNumberToGet random numbers from the pool. Something like (cool way to get random selection from a list)

foreach (decimal d in decList.OrderBy(x => randomInstance.Next())Take(xNumberToGet))
{

}

Then I might check the total of those, and if total is less, i might shift the numbers up (to the next available number) slowly. If the total is more, I might shift the numbers down. I'm still now sure how to implement or if there is a better design readily available. Any help would be much appreciated.

Kirill Bestemyanov
  • 11,946
  • 2
  • 24
  • 38
stormist
  • 5,709
  • 12
  • 46
  • 65
  • You can use [Fisher-Yates shuffle algorithm](http://stackoverflow.com/questions/9557883/random-plot-algorithm) – L.B Oct 12 '12 at 08:21
  • But how do I make sure the random numbers add up to a certain value? – stormist Oct 12 '12 at 08:39
  • It's a biggish job just to determine whether or not there exists a subset of size `xNumberToGet` whose total is `yTotalValue`. So I don't think you can avoid a longish processing time. – Steve Jessop Oct 12 '12 at 08:45
  • Are you picking numbers out of the pool as you go? Ie. if the target is 4, for a list of available decimals of `[1, 1.5, 2]`, would `[1, 1, 2]` satisfy it, or are the picked numbers required to be unique? – AKX Oct 12 '12 at 08:57
  • think Knapsack problem, do you wan the first solution or all solutions? – Jodrell Oct 12 '12 at 09:05
  • 2
    I think this could be tough http://xkcd.com/287/ – Jodrell Oct 12 '12 at 09:14
  • seems relavant http://stackoverflow.com/questions/403865/algorithm-to-sum-up-a-list-of-numbers-for-all-combinations – Jodrell Oct 12 '12 at 09:19
  • as does http://stackoverflow.com/questions/9072558/how-do-i-determine-if-any-combination-of-the-sum-of-a-set-of-values-is-equal-to – Jodrell Oct 12 '12 at 09:20
  • last of all https://en.wikipedia.org/wiki/Subset_sum_problem – Jodrell Oct 12 '12 at 09:21

2 Answers2

1

There are k such subsets of decList (k might be 0).

Assuming that you want to select each one with uniform probability 1/k, I think you basically need to do the following:

  1. iterate over all the matching subsets
  2. select one

Step 1 is potentially a big task, you can look into the various ways of solving the "subset sum problem" for a fixed subset size, and adapt them to generate each solution in turn.

Step 2 can be done either by making a list of all the solutions and choosing one or (if that might take too much memory) by using the clever streaming random selection algorithm.

If your data is likely to have lots of such subsets, then generating them all might be incredibly slow. In that case you might try to identify groups of them at a time. You'd have to know the size of the group without visiting its members one by one, then you can choose which group to use weighted by its size, then you've reduced the problem to selecting one of that group at random.

If you don't need to select with uniform probability then the problem might become easier. At the best case, if you don't care about the distribution at all then you can return the first subset-sum solution you find -- whether you'd call that "at random" is another matter...

Community
  • 1
  • 1
Steve Jessop
  • 273,490
  • 39
  • 460
  • 699
1

Ok, start with a little extension I got from this answer,

public static IEnumerable<IEnumerable<T>> Combinations<T>(
    this IEnumerable<T> source,
    int k)
{
    if (k == 0)
    {
        return new[] { Enumerable.Empty<T>() };
    }

    return source.SelectMany((e, i) =>
        source.Skip(i + 1).Combinations(k - 1)
            .Select(c => (new[] { e }).Concat(c)));
}

this gives you a pretty efficient method to yield all the combinations with k members, without repetition, from a given IEnumerable. You could make good use of this in your implementation.

Bear in mind, if the IEnumerable and k are sufficiently large this could take some time, i.e. much longer than you have. So, I've modified your function to take a CancellationToken.

private static IEnumerable<decimal> GetWinningValues(
    IEnumerable<decimal> allValues,
    int numberToGet, 
    decimal targetValue,
    CancellationToken canceller)
{
    IList<decimal> currentBest = null;
    var currentBestGap = decimal.MaxValue;
    var locker = new object();

    allValues.Combinations(numberToGet)
        .AsParallel()
        .WithCancellation(canceller)
        .TakeWhile(c => currentBestGap != decimal.Zero)
        .ForAll(c =>
        {
            var gap = Math.Abs(c.Sum() - targetValue);
            if (gap < currentBestGap)
            {
                lock (locker)
                {
                    currentBestGap = gap;
                    currentBest = c.ToList();
                }
            }
        }

    return currentBest;
}

I've an idea that you could sort the initial list and quit iterating the combinations at a certain point, when the sum must exceed the target. After some consideration, its not trivial to identify that point and, the cost of checking may exceed the benefit. This benefit would have to be balanced agaist some function of the target value and mean of the set.

I still think further optimization is possible but I also think that this work has already been done and I'd just need to look it up in the right place.

Community
  • 1
  • 1
Jodrell
  • 34,946
  • 5
  • 87
  • 124