2

I have built an application that is used to simulate the number of products that a company can produce in different "modes" per month. This simulation is used to aid in finding the optimal series of modes to run in for a month to best meet the projected sales forecast for the month. This application has been working well, until recently when the plant was modified to run in additional modes. It is now possible to run in 16 modes. For a month with 22 work days this yields 9,364,199,760 possible combinations. This is up from 8 modes in the past that would have yielded a mere 1,560,780 possible combinations. The PC that runs this application is on the old side and cannot handle the number of calculations before an out of memory exception is thrown. In fact the entire application cannot support more than 15 modes because it uses integers to track the number of modes and it exceeds the upper limit for an integer. Baring that issue, I need to do what I can to reduce the memory utilization of the application and optimize this to run as efficiently as possible even if it cannot achieve the stated goal of 16 modes. I was considering writing the data to disk rather than storing the list in memory, but before I take on that overhead, I would like to get people’s opinion on the method to see if there is any room for optimization there.

EDIT Based on a suggestion by few to consider something more academic then merely calculating every possible answer, listed below is a brief explanation of how the optimal run (combination of modes) is chosen. Currently the computer determines every possible way that the plant can run for the number of work days that month. For example 3 Modes for a max of 2 work days would result in the combinations (where the number represents the mode chosen) of (1,1), (1,2), (1,3), (2,2), (2,3), (3,3) For each mode a product produces at a different rate of production, for example in mode 1, product x may produce at 50 units per hour where product y produces at 30 units per hour and product z produces at 0 units per hour. Each combination is then multiplied by work hours and production rates. The run that produces numbers that most closely match the forecasted value for each product for the month is chosen. However, because some months the plant does not meet the forecasted value for a product, the algorithm increases the priority of a product for the next month to ensure that at the end of the year the product has met the forecasted value. Since warehouse space is tight, it is important that products not overproduce too much either.

Thank you

private List<List<int>> _modeIterations = new List<List<int>>();

private void CalculateCombinations(int modes, int workDays, string combinationValues)
    {
        List<int> _tempList = new List<int>();

        if (modes == 1)
        {
            combinationValues += Convert.ToString(workDays);
            string[] _combinations = combinationValues.Split(',');

            foreach (string _number in _combinations)
            {
                _tempList.Add(Convert.ToInt32(_number));
            }
            _modeIterations.Add(_tempList);
        }
        else
        {
            for (int i = workDays + 1; --i >= 0; )
            {
                CalculateCombinations(modes - 1, workDays - i, combinationValues + i + ",");
            }
        }
    }
Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Irwin M. Fletcher
  • 2,606
  • 3
  • 25
  • 29
  • 2
    Why are you generating them all at once? Couldn't you generate some of them, process those and throw them away? Then generate others, process those, etc. I'm also wondering what kind of algorithm needs to generate millions of combinations to figure out the best one by brute force. Surely there must be some smarter way of solving whatever problem this code is helping you solve? – Joren Sep 23 '09 at 13:28
  • I don't understand this question: you're saying that your factory has 22 work days, and on each work day it can operate in a different mode (with a different output). You want to find the combination of modes which gets as close to the projected total output for the month as possible? Can you switch mode on each day, or only once per week, or anything else? – Frerich Raabe Sep 23 '09 at 13:34
  • Also, how did you come up with the numbers 1,560,780 and 9,364,199,760? – Frerich Raabe Sep 23 '09 at 13:35
  • When there were only a few million to consider then letting the PC plug away for a minute or two brute forcing its way through didn't seem so bad, you are correct, this is no longer practical. Perhaps I could break up each month so that it only looks at a few million combinations at a time, which is a good suggestion. As for a new algorithm to calculate it with, if I could have done that in the first place, I wouldn’t be where I am now :). – Irwin M. Fletcher Sep 23 '09 at 13:37
  • Only one mode per day. (modes + days - 1)! / ((days)!(modes-1)! – Irwin M. Fletcher Sep 23 '09 at 13:38
  • Do you need to make use of all work days? If so, you'd have modes^^workdays possible combinations. I.e. 16^^22 = 309485009821345068724781056 combinations. :-} – Frerich Raabe Sep 23 '09 at 13:52
  • But if you have 22 days, and can choose any of 16 modes for any of those days, you would have 16^22 permutations. What rule didn't you tell us that you used to get to your formula? ;) – Joren Sep 23 '09 at 13:55
  • Joren, it doesn't matter what order the choices come in. In a given month, doing mode 1 for 11 days and mode 2 for 11 days gives the same results no matter whether it is 1, 2, 1, 2,... 1, 2 or 1,1,1,...2,2,2. You can thereby cut down enormously on the number of possibilities. (Leaving behind a still-enormous number, unfortunately!) – Eric Lippert Sep 23 '09 at 16:22

5 Answers5

10

This kind of optimization problem is difficult but extremely well-studied. You should probably read up in the literature on it rather than trying to re-invent the wheel. The keywords you want to look for are "operations research" and "combinatorial optimization problem".

It is well-known in the study of optimization problems that finding the optimal solution to a problem is almost always computationally infeasible as the problem grows large, as you have discovered for yourself. However, it is frequently the case that finding a solution guaranteed to be within a certain percentage of the optimal solution is feasible. You should probably concentrate on finding approximate solutions. After all, your sales targets are already just educated guesses, therefore finding the optimal solution is already going to be impossible; you haven't got complete information.)

What I would do is start by reading the wikipedia page on the Knapsack Problem:

http://en.wikipedia.org/wiki/Knapsack_problem

This is the problem of "I've got a whole bunch of items of different values and different weights, I can carry 50 pounds in my knapsack, what is the largest possible value I can carry while meeting my weight goal?"

This isn't exactly your problem, but clearly it is related -- you've got a certain amount of "value" to maximize, and a limited number of slots to pack that value into. If you can start to understand how people find near-optimal solutions to the knapsack problem, you can apply that to your specific problem.

Eric Lippert
  • 647,829
  • 179
  • 1,238
  • 2,067
  • You are absolutely correct; I need to revisit this approach. I will read up in the areas that you have suggested. Perhaps a question about heuristics is in my future. – Irwin M. Fletcher Sep 23 '09 at 17:20
5

You could process the permutation as soon as you have generated it, instead of collecting them all in a list first:

public delegate void Processor(List<int> args);

private void CalculateCombinations(int modes, int workDays, string combinationValues, Processor processor)
{
    if (modes == 1)
    {
        List<int> _tempList = new List<int>();
        combinationValues += Convert.ToString(workDays);
        string[] _combinations = combinationValues.Split(',');

        foreach (string _number in _combinations)
        {
            _tempList.Add(Convert.ToInt32(_number));
        }
        processor.Invoke(_tempList);
    }
    else
    {
        for (int i = workDays + 1; --i >= 0; )
        {
            CalculateCombinations(modes - 1, workDays - i, combinationValues + i + ",", processor);
        }
    }
}

I am assuming here, that your current pattern of work is something along the lines

CalculateCombinations(initial_value_1, initial_value_2, initial_value_3);

foreach( List<int> list in _modeIterations ) {

    ... process the list ...

}

With the direct-process-approach, this would be

private void ProcessPermutation(List<int> args) 
{
    ... process ...
}

... somewhere else ...

CalculateCombinations(initial_value_1, initial_value_2, initial_value_3, ProcessPermutation);

I would also suggest, that you try to prune the search tree as early as possible; if you can already tell, that certain combinations of the arguments will never yield something, which can be processed, you should catch those already during generation, and avoid the recursion alltogether, if this is possible.

In new versions of C#, generation of the combinations using an iterator (?) function might be usable to retain the original structure of your code. I haven't really used this feature (yield) as of yet, so I cannot comment on it.

Dirk
  • 30,623
  • 8
  • 82
  • 102
  • +1, definitely: if it's possible to tell at the moment of generation whether the current combination is superior or inferior to another combination, you should do that and discard (or remove) the inferior combination before storing it. Or, if you want to end up with *X* combinations, just see if the current combination is inferior to the current top *X* you have stored. – Jeff Sternal Sep 23 '09 at 13:39
  • 1
    +1, This was a good solution to the question that I asked. It has indeed taken care of the memory issue that I was facing. However, I didn't mark it as the answer because as Eric and Jorge note below, this brute force solution will always have a limit. For the short term, this has allowed me to move on with the issue at hand, thank you for your time. – Irwin M. Fletcher Sep 23 '09 at 17:18
2

The problem lies more in the Brute Force approach that in the code itself. It's possible that brute force might be the only way to approach the problem but I doubt it. Chess, for example, is unresolvable by Brute Force but computers play at it quite well using heuristics to discard the less promising approaches and focusing on good ones. Maybe you should take a similar approach.

On the other hand we need to know how each "mode" is evaluated in order to suggest any heuristics. In your code you're only computing all possible combinations which, anyway, will not scale if the modes go up to 32... even if you store it on disk.

Jorge Córdoba
  • 51,063
  • 11
  • 80
  • 130
1
if (modes == 1)
{
    List<int> _tempList = new List<int>();
    combinationValues += Convert.ToString(workDays);
    string[] _combinations = combinationValues.Split(',');

    foreach (string _number in _combinations)
    {
        _tempList.Add(Convert.ToInt32(_number));
    }
    processor.Invoke(_tempList);
}

Everything in this block of code is executed over and over again, so no line in that code should make use of memory without freeing it. The most obvious place to avoid memory craziness is to write out combinationValues to disk as it is processed (i.e. use a FileStream, not a string). I think that in general, doing string concatenation the way you are doing here is bad, since every concatenation results in memory sadness. At least use a stringbuilder (See back to basics , which discusses the same issue in terms of C). There may be other places with issues, though. The simplest way to figure out why you are getting an out of memory error may be to use a memory profiler (Download Link from download.microsoft.com).

By the way, my tendency with code like this is to have a global List object that is Clear()ed rather than having a temporary one that is created over and over again.

Brian
  • 25,523
  • 18
  • 82
  • 173
0

I would replace the List objects with my own class that uses preallocated arrays to hold the ints. I'm not really sure about this right now, but I believe that each integer in a List is boxed, which means much more memory is used than with a simple array of ints.

Edit: On the other hand it seems I am mistaken: Which one is more efficient : List<int> or int[]

Community
  • 1
  • 1
rslite
  • 81,705
  • 4
  • 44
  • 47
  • 2
    Untrue. Generics in C# are decent enough not to box value types. You might be thinking of .NET 1.1 `ArrayList`, or you might be thinking of Java generics. (At least how they worked around 1.4.2) In both cases ints are indeed boxed. – Joren Sep 23 '09 at 13:30
  • Yep, if you read my edit to the post you can see I already admitted that. – rslite Sep 24 '09 at 19:57