In the worst case, you do have to go through all 2^n
subsets of your dataset, but if all of your items are non-negative you can start by filtering on item.Number <= 500
.
Here is a possible Subsets
method (actually an answer to How to get all subsets of an array?, but don't tell them):
public static IEnumerable<IEnumerable<T>> Subsets(this IEnumerable<T> source)
{
var first = source.FirstOrDefault();
if (first == null) return new[] { Enumerable.Empty<T>() };
var others = source.Skip(1).Subsets();
return others.Concat(others.Select(s => s.Concat(new { first })));
}
Once you have your Subsets
method, then you can filter the result as follows, though performance is still of the order of a gazillion (or 2^n
if you want to be picky).
var sets = items.Where(i => i.Number <= 500)
.Subsets().Where(s => s.Sum(i => i.Number) == 500);
However, if you do have the constraint on Number
, that it is non-negative, you can combine the Subsets
operation with a search for a target sum. That would mean you would define
public static IEnumerable<IEnumerable<T>> SubsetsAddingUpTo(this IEnumerable<T> source, int target)
{
// This stopping condition ensures that you will not have to walk the rest of the tree when you have already hit or exceeded your target.
// It assumes that the Number values are all non-negative.
if (target < 0) return Enumerable.Empty<IEnumerable<T>>();
var first = source.FirstOrDefault();
if (first == null) return Enumerable.Empty<IEnumerable<T>>();
var tail = source.Skip(1).Where(i => i.Number <= target).ToList();
var othersIncludingFirst = tail.SubsetsAddingUpTo(target - first.Number);
var othersExcludingFirst = tail.SubsetsAddingUpTo(target);
return othersExcludingFirst.Concat(othersIncludingFirst.Select(s => s.Concat(new { first })));
}
Because the check for <= target
happens inside the method, you don't have to do any pre-filtering. However, you can perform a sort before you do the search, to give you your sets in a hierarchical date order. The call will be
var sets = items.OrderByDescending(i => i.Date).SubsetsAddingUpTo(500);
This should actually give you decent performance. The worst case (every item has a number of 0 or 1) won't be very good (order 2^n
), but if most of the values of Number
are of a similar order of magnitude to your target sum, as is the case in your example, then the stopping condition will fly in to your rescue and save you a large number of unnecessary operations.