Dividing an array into two subsets of equal sizes having minimum difference in sum of values

Question

Given a set of n integers, divide the set in two subsets of n/2 sizes each such that the difference of the sum of two subsets is as minimum as possible. If n is even, then sizes of two subsets must be strictly n/2 and if n is odd, then size of one subset must be (n-1)/2 and size of other subset must be (n+1)/2.

For example, let given set be {3, 4, 5, -3, 100, 1, 89, 54, 23, 20}, the size of set is 10. Output for this set should be {4, 100, 1, 23, 20} and {3, 5, -3, 89, 54}. Both output subsets are of size 5 and sum of elements in both subsets is same (148 and 148). Another example where n is odd. Let given set be {23, 45, -34, 12, 0, 98, -99, 4, 189, -1, 4}. The output subsets should be {45, -34, 12, 98, -1} and {23, 0, -99, 4, 189, 4}. The sums of elements in two subsets are 120 and 121 respectively.

After much searching, I found out this problem is NP-Hard. Therefore, a polynomial time solution is not possible. However, I was thinking something in lines of this:

Initialise first subset as the first element.
Initialise second subset as second element.
Then depending upon which subset is smaller in size and the sum is lacking in which subset, I will insert the next elements.

The above might achieve a linear time, I guess.

However, the solution given here is way too complicated: http://www.geeksforgeeks.org/tug-of-war/. I couldn't understand it. Therefore, I just want to ask, is my solution correct? Considering the fact that this is a NP-Hard problem, I think it should do? And if not, can someone please explain in like really brief, how exactly the code on the link attached works? Thanks!

score 6 · Answer 1 · answered Jul 03 '15 at 20:20

6

Your solution is wrong.

It's a greedy approach to solve the Subset-Sum problem/ Partition Problem, that fails.

Here is a simple counter example:

arr = [1,2,3]

Your solution will assign A={1}, B={2}, and then chose to assign 3 to A, and get A={1,3}, B={2} - which is not optimal, since the optimal solution is A={1,2}, b={3}

The correct way to do it is using Dynamic Programming, by following the recursive formulas:

D(x,i) = false    i < 0
D(0,i) = true
D(x,i) = D(x,i-1) OR D(x-arr[i],i-1)

It can be done efficiently using Dynamic Programming by building a table that follows the recurrence bottom-up.

The table will be of size (SUM/2 + 1)*(n+1) (where SUM is the sum of all elements), and then find the maximal value in the table such that D(x,n) = true

answered Jul 03 '15 at 20:20

amit

175,853
27
231
333

Your example is correct. Got it. However, though your approach sounds correct, but I am still not able to understand it properly. Can you attach a bit more description about what `D(x,i)` is and what does true and false stand for? – John Lui Jul 04 '15 at 05:54
@JohnLui `D(x,i)` indicates you can get a subset of sum `x` using the elements from the first `i` elements. This means, `D(SUM/2,n)` means there is a subset of size `SUM/2` using any element in the list. – amit Jul 04 '15 at 08:09
@amit how does this ensure equal split? n/2 n/2? – CoderBC Jan 23 '19 at 00:41
@CoderBC To ensure equal split, you need to add a dimension to the recursive formula with number of elements chosen. It adds `*n` multiplier to the complexity, making it `O(SUM*n^2)` instead of `O(SUM*n)`. – amit Feb 24 '19 at 09:47

score -1 · Answer 2 · edited Aug 04 '19 at 14:00

The problem is a specialization of SubSet Sum problem which decides whether we can find any two partition that has equal sum. This is an NP-Complete problem.

But, the problem in question asking for 2 such equal partition where the equality holds when we satisfy following two conditions:

Size of the partitions differ by at most 1 Sum of the elements in the partitions is minimum Certainly we are asking here for a suboptimal solution to the more generalized NP-complete problem.

For example, for A=[1, 2, 3, 4, 5, 6, 7, 8, 9], we can have such two partitions {[1, 3, 2, 7, 9], [5, 4, 6, 8]} with sum diff = abs(22-23) = 1.

Our goal is to find suboptimal solution with best approximation ratio. Idea is to partition the array in pairs of element that would distribute the sum as uniformly as possible across the partitions. So, each time we would try to take 2 pairs and put one pair in a partition and the other pair in the other partition.

Sort the array If number of elements in less than 4 then create partitions accordingly for each cases when we have 1 element or 2 element or 3 element in the array. Otherwise we will take 2 pair each time and put into two partition such that it minimizes the sum diff. Pick the pair(largest, smallest) elemets in the the sorted array and put it to the smaller (wr.to sum) partition. Then pick the second largest element and find its buddy to put them in the ‘other’ partition such that sum of second largest and its buddy minimizes the sum difference of the partitions. The above approach will give a suboptimal solution. The problem in NP complete so, we can’t have an optimal solution but we can improve the approximation ratio as follows. If we have suboptimal solution (ie. sum diff != 0) then we try to improve the solution by swapping a large element in the larger partition with a small element in the smaller partition such that the swap actually minimizes the sum diff. The O(n^2) time and O(n) space implementation of the above approach is as follows –

//overall O(n^2) time and O(n) space solution using a greedy approach


----------


----------


----------



     public static ArrayList<Integer>[] findEqualPartitionMinSumDif(int A[]){
    //first sort the array - O(nlgn)
    Arrays.sort(A);
    ArrayList<Integer> partition1 = new ArrayList<Integer>();
    ArrayList<Integer> partition2 = new ArrayList<Integer>();

    //create index table to manage largest unused and smallest unused items
    //O(n) space and O(nlgn) time to build and query the set
    TreeSet<Integer> unused = new TreeSet<>();
    for(int i = 0; i<A.length; i++){
        unused.add(i);
    }

    int i = 0;
    int j = A.length-1;
    int part1Sum = 0;
    int part2Sum = 0;
    int diffSum = 0;

    //O(n^2) processing time
    while(unused.size() > 0){
        i = unused.first();
        j = unused.last();
        diffSum = part1Sum-part2Sum;

        //in case of size of the array is not multiple of 4 then we need to process last 3(or 2 or 1)
        //element to assign partition. This is special case handling
        if(unused.size() < 4){
            switch(unused.size()){
                case 1: 
                    //put the 1 remaining item into smaller partition
                    if(diffSum > 0){
                        partition2.add(A[i]);
                        part2Sum += A[i];
                    }
                    else{
                        partition1.add(A[i]);
                        part1Sum += A[i];
                    }
                break;
                case 2:
                    //among the remaining 2 put the max in smaller and min in larger bucket
                    int max = Math.max(A[i], A[j]);
                    int min = Math.min(A[i], A[j]);
                    if(diffSum > 0){
                        partition2.add(max);
                        partition1.add(min);
                        part2Sum += max;
                        part1Sum += min;
                    }
                    else{
                        partition1.add(max);
                        partition2.add(min);
                        part1Sum += max;
                        part2Sum += min;
                    }
                break;
                case 3:
                    //among the remaining 3 put the two having total value greater then the third one into smaller partition
                    //and the 3rd one to larger bucket 
                    unused.remove(i);
                    unused.remove(j);
                    int middle = unused.first();

                    if(diffSum > 0){
                        if(A[i]+A[middle] > A[j]){
                            partition2.add(A[i]);
                            partition2.add(A[middle]);
                            partition1.add(A[j]);
                            part2Sum += A[i]+A[middle];
                            part1Sum += A[j];
                        }
                        else{
                            partition2.add(A[j]);
                            partition1.add(A[i]);
                            partition1.add(A[middle]);
                            part1Sum += A[i]+A[middle];
                            part2Sum += A[j];
                        }
                    }
                    else{
                        if(A[i]+A[middle] > A[j]){
                            partition1.add(A[i]);
                            partition1.add(A[middle]);
                            partition2.add(A[j]);
                            part1Sum += A[i]+A[middle];
                            part2Sum += A[j];
                        }
                        else{
                            partition1.add(A[j]);
                            partition2.add(A[i]);
                            partition2.add(A[middle]);
                            part2Sum += A[i]+A[middle];
                            part1Sum += A[j];
                        }
                    }
                break;
                default:
            }

            diffSum = part1Sum-part2Sum;
            break;
        }

        //first take the largest and the smallest element to create a pair to be inserted into a partition
        //we do this for having a balanced distribute of the numbers in the partitions
        //add pair (i, j) to the smaller partition 
        int pairSum = A[i]+A[j];
        int partition = diffSum > 0 ? 2 : 1;
        if(partition == 1){
            partition1.add(A[i]);
            partition1.add(A[j]);
            part1Sum += pairSum;
        }
        else{
            partition2.add(A[i]);
            partition2.add(A[j]);
            part2Sum += pairSum;
        }

        //update diff
        diffSum = part1Sum-part2Sum;
        //we have used pair (i, j)
        unused.remove(i);
        unused.remove(j);
        //move j to next big element to the left
        j = unused.last();
        //now find the buddy for j to be paired with such that sum of them is as close as to pairSum
        //so we will find such buddy A[k], i<=k<j such that value of ((A[j]+A[k])-pairSum) is minimized.
        int buddyIndex = unused.first();
        int minPairSumDiff = Integer.MAX_VALUE;
        for(int k = buddyIndex; k<j; k++){
            if(!unused.contains(k))
                continue;

            int compPairSum = A[j]+A[k];
            int pairSumDiff = Math.abs(pairSum-compPairSum);

            if(pairSumDiff < minPairSumDiff){
                minPairSumDiff = pairSumDiff;
                buddyIndex = k;
            }
        }

        //we now find buddy for j. So we add pair (j,buddyIndex) to the other partition
        if(j != buddyIndex){
            pairSum = A[j]+A[buddyIndex];
            if(partition == 2){
                partition1.add(A[j]);
                partition1.add(A[buddyIndex]);
                part1Sum += pairSum;
            }
            else{
                partition2.add(A[j]);
                partition2.add(A[buddyIndex]);
                part2Sum += pairSum;
            }

            //we have used pair (j, buddyIndex)
            unused.remove(j);
            unused.remove(buddyIndex);
        }
    }

    //if diffsum is greater than zero then we can further try to optimize by swapping 
    //a larger elements in large partition with an small element in smaller partition
    //O(n^2) operation with O(n) space
    if(diffSum != 0){
        Collections.sort(partition1);
        Collections.sort(partition2);

        diffSum = part1Sum-part2Sum;
        ArrayList<Integer> largerPartition = (diffSum > 0) ? partition1 : partition2;
        ArrayList<Integer> smallerPartition = (diffSum > 0) ? partition2 : partition1;

        int prevDiff = Math.abs(diffSum);
        int largePartitonSwapCandidate = -1;
        int smallPartitonSwapCandidate = -1;
        //find one of the largest element from large partition and smallest from the smaller partition to swap 
        //such that it overall sum difference in the partitions are minimized
        for(i = 0; i < smallerPartition.size(); i++){
            for(j = largerPartition.size()-1; j>=0; j--){
                int largerVal = largerPartition.get(j);
                int smallerVal = smallerPartition.get(i);

                //no point of swapping larger value from smaller partition
                if(largerVal <= smallerVal){
                    continue;
                }

                //new difference if we had swapped these elements
                int diff = Math.abs(prevDiff - 2*Math.abs(largerVal - smallerVal));
                if(diff == 0){
                    largerPartition.set(j, smallerVal);
                    smallerPartition.set(i, largerVal);
                    return new ArrayList[]{largerPartition, smallerPartition};
                }
                //find the pair to swap that minimizes the sum diff
                else if (diff < prevDiff){
                    prevDiff = diff;
                    largePartitonSwapCandidate = j;
                    smallPartitonSwapCandidate = i;
                }
            }
        }

        //if we indeed found one such a pair then swap it. 
        if(largePartitonSwapCandidate >=0 && smallPartitonSwapCandidate >=0){
            int largerVal = largerPartition.get(largePartitonSwapCandidate);
            int smallerVal = smallerPartition.get(smallPartitonSwapCandidate);
            largerPartition.set(largePartitonSwapCandidate, smallerVal);
            smallerPartition.set(smallPartitonSwapCandidate, largerVal);
            return new ArrayList[]{largerPartition, smallerPartition};
        }
    }

    return new ArrayList[]{partition1, partition2};
}

Copied: http://www.zrzahid.com/equal-partitioning-with-minimum-difference-in-sum/ — Rajat Mishra, Aug 22 '21 at 17:55

Dividing an array into two subsets of equal sizes having minimum difference in sum of values

2 Answers2

Linked