0

I have a large set of decimal numbers in an Excel file (for this test it is 1000 but could possibly be more or less). In my C# code I first read these into objects which hold their values as floats and their row numbers.

My problem is a knapsack kind of problem. I want to find all possible subsets of these positive and negative decimal numbers that add up to a given sum total. The only thing I have to restrict my problem is that the subset size must be greater than 1 (Though I would be open to implementing a solution which restricts the maximum length of a subset with a parameter by manually guessing what would be reasonable).

I am currently writing my code in C# but could change that approach if needed.

I know memory wise this problem is very hard to solve, at least with the approach I have had in mind. That approach was to first generate a List of all possible subsets out of the 1000 entries (no matter their sum), and then find all the subsets out of that list that have the correct sum total. This is though VERY time consuming as I believe the total amount of possible combinations for a 1000 entries is 2^1000 -1 which is a HUGE number and my computer ran for hours until eventually running out of memory (System.OutOfMemoryException) when trying to generate these combinations.

I have tried multiple solutions found on SO such as Find out which combinations of numbers in a set add up to a given total as well as purely using the Excel Solver Add-In. None of them have worked for my case though since the amount of values I have is so big.


EDIT: I would like to add that the numbers are in no particular order but if it would help my case I could possibly sort them once I have them all read into objects in a list.

Matheos
  • 207
  • 2
  • 12
  • You do understand that the knapsack-problem is np-complete? – Kris Vandermotten Aug 02 '19 at 12:19
  • Yes I do. But I am wondering if there is any possible way to improve on my "brute force" thinking? – Matheos Aug 02 '19 at 12:28
  • 2^1000 is approximately 10^301. Assuming you could test 1,000,000,000 sets per second, testing them all would take 10^284 years. The universe won't last that long. Also, being np-complete means that there is no more efficient solution than brute force. See https://en.wikipedia.org/wiki/NP-completeness and https://en.wikipedia.org/wiki/Subset_sum_problem – Kris Vandermotten Aug 02 '19 at 12:58

0 Answers0