Given a set of positive integers and value X, find a subset S whose sum is >= X, such that sum(S) is the lowest of all sums of such existing subsets

Question

Given a set of positive integers and value X, find a subset S whose sum is >= X, such that sum(S) is the lowest of all sums of such existing subsets.

Can it be done in polynomial time? What would be the solution?

Checking all subsets is 2^n.

Are all the numbers positive ? It is difficult to insure a polynomial time in the general case. In practice, backtraking offers better performance than brute-force — Damien, Apr 18 '20 at 07:24
Does this answer your question? [Linear algorithm to find minimum subset sum over a threshold](https://stackoverflow.com/questions/17159827/linear-algorithm-to-find-minimum-subset-sum-over-a-threshold) — גלעד ברקן, Apr 18 '20 at 13:30
Note that the post you mentioned concerns bounded integers, float numbers here. @ גלעדברקן — Damien, Apr 18 '20 at 14:15
Actually, I forgot I can convert them to integers with no loss. — Greyshack, Apr 19 '20 at 14:42
@גלעדברקן that indeed answers this question But I have come to a conclusion that this https://stackoverflow.com/questions/61306597/given-a-collection-of-integers-and-threshold-value-t-divide-the-collection-into is the problem I want to solve. — Greyshack, Apr 19 '20 at 15:08

score 0 · Answer 1 · answered Apr 18 '20 at 10:50

Backtracking is a possibility for this problem.

It allows examining all the possibilities recursively, without the need of a large amount of memory.

It stops as soon as an optimal solution is found: sum = X, up to a given tolerance (for example 10^-10 in the programme below)

It allows to implement a simple procedure of premature abandon:
at a given time, if sum + the sum of all remaining elements is higher than X, then we can give up examining the current path, without examining the remaining elements. This procedure is optimized by sorting the input data in decreasing order

Here is a code, in C++. The code being quite basic, it should be easy to migrate it to another language.

This programme tests the algorithm with random (uniform) elements, and display the number of iterations.

The complexity (i.e. the number of iterations) is really varying with the random elements (of course), but also greatly depends of the tolerance that we accept. With a tolerance of 10^-10 and a size of n=100, the complexity generally stays quite acceptable. It is no longer the case with a smaller tolerance.

With n = 100 and five runs, I obtained for the number of iterations: 6102, 3672, 8479, 2235, 12926. However, it is clear that there is no warranty to have good performances in all cases. For n = 100, the number of candidates (subsets) is huge.

//  Find min sum greater than a given number X

#include    <iostream>
#include    <iomanip>
#include    <vector>
#include    <algorithm>
#include    <tuple>
#include    <cstdlib>
#include    <cmath>
#include    <ctime>

std::tuple<double, std::vector<double>> min_sum_greater(std::vector<double> &a, double X) {
    int n = a.size();
    std::vector<bool> parti (n, false);     // current partition studies
    std::vector<bool> parti_opt (n, false);  // optimal partition
    std::vector<double> sum_back (n, 0);   // sum of remaining elements

    //std::cout << "n = " << n << " \tX = " << X << "\n";
    std::sort(a.begin(), a.end(), std::greater<double>());

    sum_back[n-1] = a[n-1];
    for (int i = n-2; i >= 0; --i) {
        sum_back[i] = sum_back[i+1] + a[i];
    }
    double sum = 0.0;     // current sum

    int i = 0;          // index of the element being examined
    double best_sum = sum_back[0] + 1.0;
    bool up_down = true;
    double eps = 1.0e-10;       // error tolerance
    long long cpt = 0;      // to check the number of iterations

    while (true) {          // UP
        //std::cout << "Start of while loop: i = " << i << "\n";
        cpt++;
        if (up_down) {
            bool abandon = (sum + sum_back[i] < X - eps) || (sum > best_sum);
            if (abandon) {  //premature abandon
                parti[i] = false;
                up_down = false;
                i--;
                continue;
            }

            parti[i] = true;
            sum += a[i];
            //std::cout << "UP, i = " << i << " \tsum = " << sum << "\n";

            if (fabs(sum - X) < eps) {
                best_sum = sum;
                parti_opt = parti;
                break;
            }

            if (sum >= X) {
                if (sum < best_sum) {
                    best_sum = sum;
                    parti_opt = parti;
                    //std::cout << "i = " << i << " \tbest sum = " << best_sum << "\n";
                }
                parti[i] = false;
                sum -= a[i];
            }

            if (i == (n-1)) {           // leaf
                up_down = false;
                i--;
                continue;
            }
            i++;        

        } else {            // DOWN
            if (i < 0) break;
            if (parti[i]) {
                sum -= a[i];
                parti[i] = false;
                i++;
                up_down = true;
            } else {
                i--;
                up_down = false;
            }
        }
    }
    std::vector<double> answer;
    for (int i = 0; i < n; ++i) {
        if (parti_opt[i]) answer.push_back (a[i]);
    }
    std::cout << "number of iterations = " << cpt << " for n = " << n << "\n";
    return std::make_tuple (best_sum, answer);
}

int main () {
    //std::vector<double> a = {5, 6, 2, 10, 2, 3, 4, 13, 17, 38, 42};
    double X = 33.5;

    srand (time(NULL));
    int n = 100;
    double vmax = 100;
    X = vmax * n / 4;
    std::vector<double> a (n);
    for (int i = 0; i < n; ++i) {
        a[i] = vmax * double(rand())/RAND_MAX;
    }

    double sum;
    std::vector<double> y;
    std::tie (sum, y) = min_sum_greater (a, X);

    std::cout << std::setprecision(15) << "sum = " << sum << "\n";

    if (n < 20) {
        std::cout << "set: ";
        for (auto val: y) {
            std::cout << val << " ";
        }
        std::cout << "\n";
    }

}

Given a set of positive integers and value X, find a subset S whose sum is >= X, such that sum(S) is the lowest of all sums of such existing subsets

1 Answers1