12

In Java, how should I find the closest (or equal) possible sum of an Array's elements to a particular value K?

For example, for the array {19,23,41,5,40,36} and K=44, the closest possible sum is 23+19=42. I've been struggling on this for hours; I know almost nothing about dynamic programming. By the way, the array contains only positive numbers.

TRiG
  • 10,148
  • 7
  • 57
  • 107
Karim El Sheikh
  • 339
  • 1
  • 2
  • 11
  • This is a fair question, but probably more suited here: http://math.stackexchange.com/ – Magnilex Apr 15 '13 at 18:29
  • 3
    Wouldn't 5 + 40 be the closest possible sum? – splungebob Apr 15 '13 at 18:31
  • I meant that the sum would be under or equal to K – Karim El Sheikh Apr 15 '13 at 18:32
  • 1
    Is there a limit of numbers you can add up? For example, can you only use two elements per sum, or could you also use the sum over all elements? – G. Bach Apr 15 '13 at 18:38
  • The sum over any number of elements – Karim El Sheikh Apr 15 '13 at 18:43
  • 6
    http://en.wikipedia.org/wiki/Knapsack_problem – arynaq Apr 15 '13 at 18:54
  • get all the possible subsets of your array as in the following: http://stackoverflow.com/questions/4640034/calculating-all-of-the-subsets-of-a-set-of-numbers and they loop over the sets and choose the set with the maximum sum that is below or equal to k. not optimal algorithm but it will work if that is not an issue here. –  Apr 15 '13 at 19:30
  • Calculating all subsets may literally take for years, depending on the size of his array. – G. Bach Apr 15 '13 at 19:33
  • i mentioned if it is not an issue here. so definately it will depend on his input –  Apr 15 '13 at 19:35
  • @A.J. You mean something like the solurion I provided below? G.Back, there are problems that take years, unfortunately, maybe this belongs to this category. – zafeiris.m Apr 15 '13 at 19:39
  • @zafeiris.m yes looks similar to your solution. –  Apr 15 '13 at 19:47

4 Answers4

16

You would typically use dynamic programming for such a problem. However, that essentially boils down to keeping a set of possible sums and adding the input values one by one, as in the following code, and has the same asymptotic running time: O(n K), where n is the size of your input array and K is the target value.

The constants in the version below are probably bigger, however, but I think the code is much easier to follow, than the dynamic programming version would be.

public class Test {
    public static void main(String[] args) {
        int K = 44;
        List<Integer> inputs = Arrays.asList(19,23,41,5,40,36);

        int opt = 0;                // optimal solution so far          

        Set<Integer> sums = new HashSet<>();
        sums.add(opt);

        // loop over all input values
        for (Integer input : inputs) {
            Set<Integer> newSums = new HashSet<>();

            // loop over all sums so far                        
            for (Integer sum : sums) {
                int newSum = sum + input;

                // ignore too big sums
                if (newSum <= K) {
                    newSums.add(newSum);

                    // update optimum                       
                    if (newSum > opt) {
                        opt = newSum;
                    }
                }
            }

            sums.addAll(newSums);
        }

        System.out.println(opt);
    }
}

EDIT

A short note on running time might be useful, since I just claimed O(n K) without justification.

Clearly, initialization and printing the result just takes constant time, so we should analyse the double loop.

The outer loop runs over all inputs, so it's body is executed n times.

The inner loop runs over all sums so far, which could be an exponential number in theory. However, we use an upper bound of K, so all values in sums are in the range [0, K]. Since sums is a set, it contains at most K+1 elements.

All computations inside the inner loop take constant time, so the total loop takes O(K). The set newSums also contains at most K+1 elements, for the same reason, so the addAll in the end takes O(K) as well.

Wrapping up: the outer loop is executed n times. The loop body takes O(K). Therefore, the algorithm runs in O(n K).

EDIT 2

Upon request on how to also find the elements that lead to the optimal sum:

Instead of keeping track of a single integer - the sum of the sublist - you should also keep track of the sublist itself. This is relatively straightforward if you create a new type (no getters/setters to keep the example concise):

public class SubList {
    public int size;
    public List<Integer> subList;

    public SubList() {
        this(0, new ArrayList<>());
    }

    public SubList(int size, List<Integer> subList) {
        this.size = size;
        this.subList = subList;
    }
}

The initialization now becomes:

SubList opt = new SubList();

Set<SubList> sums = new HashSet<>();
sums.add(opt);  

The inner loop over the sums needs some small adaptations as well:

for (Integer input : inputs) {
    Set<SubList> newSums = new HashSet<>();

    // loop over all sums so far                        
    for (SubList sum : sums) {
        List<Integer> newSubList = new ArrayList<>(sum.subList);
        newSubList.add(input);
        SubList newSum = new SubList(sum.size + input, newSubList);         

        // ignore too big sums
        if (newSum.size <= K) {
            newSums.add(newSum);

            // update optimum                       
            if (newSum.size > opt) {
                opt = newSum;
            }
        }
    }

    sums.addAll(newSums);
}
Vincent van der Weele
  • 12,927
  • 1
  • 33
  • 61
  • I read your solution and try to understand, maybe my question is stupid, but I can see where you *add* a number in the final solution, what I don't see is where you exclude a number of the final solution. I would expect that, at some point, you should realise that a included number should be excluded to lead to a better solution, isn't it? – zafeiris.m Apr 15 '13 at 19:27
  • @zafeiris.m I only keep track of the sums, not of the values they are composed off. Basically, after iteration `i` of the outer loop (over `inputs`), the set `sums` contains *all* possible sums that can be made with the first `i` input values. To make sure that the size of that set doesn't grow exponentially, I added `upperBound`, which makes sure `sums` contains at most `2*K` values. – Vincent van der Weele Apr 15 '13 at 19:32
  • but what if the best solution does not include the first `i` elements, but, say, the 1st, the 4th, and the 6th? – zafeiris.m Apr 15 '13 at 19:41
  • This is good but I want the sum to never be larger than K. Getting the closest sum that's smaller or equal to K. – Karim El Sheikh Apr 15 '13 at 19:43
  • @zafeiris.m ok, let's run a small example... assume `inputs={3,2,4}` and `K=8`. Clearly, 7 is the optimal solution, by picking the 1st and the 3rd input. We start with `sums={0}`. In the first iteration of the outer loop, we pick the 1st input (3), add it to all sums so far, and add the result back in `sums`, hence `sums={0, 3}`. In the second iteration, we do the same for 2, hence `sums={0,2,3,5}`. Finally, we add 4 and get `sums={0, 2, 3, 4, 5, 6, 7, 9}`. The best solution, 7, is in the set, even though it 'skipped' the 2nd input. – Vincent van der Weele Apr 15 '13 at 20:35
  • 1
    @Heuster Thank you, I can see it now, nice solution! But what I don't understand is the worst case complexity. In the example of your comment (and in any example that most of the elements will be included in the result), I see the result set gets doubled everytime, resulting in **exponential** complexity, isn't it? Tell me if I'm wrong. *Update*: Also space complexity is exponential since you keep each sub-result, right? – zafeiris.m Apr 15 '13 at 21:04
  • 1
    @zafeiris.m I added a note on running time. The trick is in `upperBound`, which keeps the size of `sums` linear in `K`. – Vincent van der Weele Apr 16 '13 at 06:05
  • @Heuster got it now, very nice, Thanks! – zafeiris.m Apr 16 '13 at 09:37
  • @zafeiris.m There likely is not a polynomial-time algorithm since this is an NP-Complete problem. – Paul Apr 24 '13 at 03:25
  • @Paulpro well, it's weakly NP-complete, so as long as `K` is polynomial in `n`, the algorithm will run in polynomial time. – Vincent van der Weele Apr 24 '13 at 06:15
  • @Paulpro absolutely! But on the other hand, in this implementation, `K` is an integer, so it's size is at most `MAX_INT`, which is a (rather big) constant, so the running time is just `O(n)` ;-) – Vincent van der Weele Apr 24 '13 at 06:35
  • isn't this just a bruteforce or kind of DP ? sorry , i don't understand java . – Aseem Goyal Feb 17 '14 at 12:59
  • @anon A brute force would try all `2^n` sums, whereas this algorithm runs in `O(n K)` time, so no, this is no brute force. – Vincent van der Weele Feb 17 '14 at 14:10
  • @Heuster: another question : the required sequence cannot be generated . Am i right ? if No , kindly explain that approach . – Aseem Goyal Feb 17 '14 at 14:14
  • @anon I updated the answer with a witnessing approach: whenever you update the optimum, you also keep track of the sublist that led to that optimum. – Vincent van der Weele Feb 17 '14 at 15:00
  • How can this code compile? The adaptation code is using sum and sums in both inner and outer loop. Does it require an update? I am trying to adapt the subList example, but not working as it says "Duplicate Variable" "sum" and "sums" – Yogendra Dec 10 '16 at 03:44
  • @YogendraJ there was clearly a typo in the second example, this is fixed now. Thanks for pointing it out! – Vincent van der Weele Dec 10 '16 at 09:44
  • As I am more comfortable with JavaScript, I have converted this here https://jsfiddle.net/thatOneGuy/jf1ucyxg/ Thanks for the solution – thatOneGuy Feb 20 '20 at 10:41
2

You can see it as a n-choose-k problem for all possible k so the complexity is exponential.

  1. Find a set of numbers that sum at most up to K. The set should include i numbers, for i=1; i<=N; i++. To implement this, for each i just take all the n-choose-i combinations of the numbers in the array.
  2. Keep a finalResult variable with the best set of numbers found so far and their sum.
  3. Compare each sub-result of step 1 with the finalResult and update it if necessary.

It reminds me of the Knapsack problem, so you may want to take a look at it.

zafeiris.m
  • 4,339
  • 5
  • 28
  • 41
1
private int closestSum(int[] a, int num){
    int k=a.length-1;
    int sum=0;
    Arrays.sort(a);

    while(a[k]>num){
        k--;
    }
    for(int i=0;i<k;i++){
        for(int j=i+1;j<=k;j++){
            if(a[i]+a[j]<=num && a[i]+a[j]>sum)
                sum = a[i]+a[j];
        }
    }
    return sum;
}
Nilamber Singh
  • 804
  • 1
  • 14
  • 33
0

I would say first sort the array. Then your example would be: arr = {5, 19, 23, 36, 40, 41}. Then: 1) Take arr[0] and arr[i], where i = arr.Size. Sum it and record the difference between the sum and K, if the sum is lower than K. 2) If sum > K, do step 1, but instead of arr[i], use arr[i-1], because we want to lower our sum. If sum < K, do step 1, but instead of arr[0], use arr[1], because we want to increase our sum. Keep repeating step2, by either increasing or decreasing the indices, until the indices for the two elements are equal. Then, we know the pair of elements that result in the smallest difference between the sum and K.

----------------Edited for arbitrary number of elements in the solution----------------

I believe you may need a tree. Here's what I'm thinking:

1) Choose a number as your top node.

2) For each number in the set, create a child node, and for each branch that was created, calculate the sum of that branch.

3) If the sum is less then K, we branch again, creating child nodes for all the elements in the set. If the sum is greater than K, we stop, keep the difference between the sum and K(If sum < K). If we find a branch with a better sum, then we keep that branch. Repeat this process until all branches are done branching.

Do steps 1-3 with different topnodes.

masotann
  • 901
  • 10
  • 29