Algorithm to recover a set given the sums of all its subsets

Question

There is a set A of positive/negative integers. We are given N numbers which are the sum of all its subsets. The task is to find the set A itself. Following is an example.

Input: 0 -2 4 5 2 3 9 7
Output: A = {-2 4 5}

For the case of positive integers, there is an O(n log n) solution suggested in this question. The algorithm

Sorts the input.

Takes the smallest element as a member of A.

Builds all possible subset sums so far (possible in O(n)) and removes them from input.

Repeats the above until the log n numbers of A are found.

But I have no idea how to deal with negative numbers since the smallest number isn't necessarily in set A (take A = {-2, -3} as an example). Any ideas on how to solve this version of problem with negative numbers? Preferably still in O(n log n).

I know a number of things to do which CAN increase efficiency. But I can't think of anything subexponential. — btilly, Apr 07 '23 at 21:48
If all the numbers are negative then the same n log n solution you use for all positive will still work. Several ways to do this, but maybe the easiest is to solve for the absolute values of the input, then slap a negative on everything in the solution. — Dave, Apr 07 '23 at 21:49

score 2 · Accepted Answer · answered Apr 07 '23 at 22:12

2

I just figured it out and Dave was on the right track.

The smallest value is the sum of the negative values. The differences between the values and the minimum are the sums of the absolute values of members of the set. You can find the absolute values in time O(n log(n)). Find any subset of the absolute values that sums to the absolute value of the minimum, reverse their signs, and you have an answer.

Note that there may be many valid answers.

answered Apr 07 '23 at 22:12

btilly

43,296
3
59
88

Could you give me a hint on how I can efficiently find all subsets that sum to the absolute value of the minimum? – fmatt Apr 09 '23 at 04:31
@fmatt https://stackoverflow.com/questions/4632322/finding-all-possible-combinations-of-numbers-to-reach-a-given-sum/64380474#64380474 shows how – btilly Apr 09 '23 at 05:39
Brilliant work! – גלעד ברקן Apr 09 '23 at 10:48

Matt Timmermans · Answer 2 · 2023-04-09T02:00:55.443

1

If you can find any member x of the set A then it's easy to separate all the subset sums into the ones that include x and the ones that don't. If x is positive, for example, then the smallest sum does not include x, and if you add x to it, you get the corresponding sum that does include x. The smallest remaining sum then does not include x, etc...

So the challenge is then just to isolate a member of A, and then repeat using the subset sums for the remaining members...

Note that any valid list of sums will contain 0, and the difference between the two smallest sums will be the same as the difference between the two largest sums, and that difference will be the smallest absolute value of a member of A.

To determine whether that smallest member is positive or negative, try separating the sums according to the above procedure and choose positive or negative according to which choice leaves a 0 in the remaining list.

Here's an implementation and a little test in python:

def getSubsetSums(lst):
    sums = [0]
    for x in lst:
        sums = sums + [x+y for y in sums]
    return sums

def getSetFromSubsetSums(sums):
    lst=[]
    sums = sorted(sums)
    while(len(sums) > 1):
        diff = sums[1] - sums[0]
        A=[]
        B=[]
        bmatched=0
        Ahas0 = False
        for x in sums:
            if bmatched < len(B) and x == B[bmatched]:
                bmatched += 1
                continue
            A.append(x)
            B.append(x+diff)
            if (x==0):
                Ahas0 = True
        if (Ahas0):
            sums = A
            lst.append(diff)
        else:
            sums = B
            lst.append(-diff)
    return lst


sums = getSubsetSums([9,1,-6,-4])
print(sums)
members = getSetFromSubsetSums(sums)
print(members)

sums = getSubsetSums([-2, 4, 5])
print(sums)
members = getSetFromSubsetSums(sums)
print(members)

[0, 9, 1, 10, -6, 3, -5, 4, -4, 5, -3, 6, -10, -1, -9, 0]
[1, -4, -6, 9]
[0, -2, 4, 2, 5, 3, 9, 7]
[-2, 4, 5]

edited Apr 09 '23 at 02:00

answered Apr 08 '23 at 02:51

Matt Timmermans

53,709
3
46
87

It works because failing to add a negative looks like adding its absolute value. More formally, let `N` be the negatives in the set and `P` be the positives. Let `X` be any subset of the set. Then `sum((N-(N∩X)) ∪ (P∩X))` is `sum(N) + sum({|x| for x in X})`. So the differences from the minimum are exactly the sums of subsets of absolute values. – btilly Apr 08 '23 at 03:51
@btilly I got that. When it comes time to select the negative members, though, I'm not sure that every subset with the proper sum results in a valid solution. – Matt Timmermans Apr 08 '23 at 04:05
It does as follows. Suppose that `S` is a subset with the right sum, and we flip signs to get our `N`. Suppose that we want to generate a subset sum `x`. By construction it is the minimum plus a sum of a subset `X'` of the absolute values. Let `Y' = S - X'` and then `Y` be the subset of `N` that is the negatives of `Y'`. Then `sum(Y ∪ (X-S)) = x`. This gives a one to one correspondence between sums of our generated set and the sums we were looking for. So the construction always works. – btilly Apr 08 '23 at 04:17
@MattTimmermans That's a cool approach and seems to have a straightforward implementation, thank you so much! – fmatt Apr 08 '23 at 04:33
1

BTW back at you. The sets `{-9, -1, 4, 6}` and `{-6, -4, 1, 9}` have exactly the same sums. How do you determine whether the smallest absolute value is positive or negative, and how to you show that you never make a choice which leads to a set that doesn't work? – btilly Apr 08 '23 at 04:50
@btilly, since those two sets have the same sums, it doesn't matter whether you choose 1 or -1, the list of sums will have *multiple zeros*, and both choices will work, leading to one answer or the other. – Matt Timmermans Apr 08 '23 at 18:04
@MattTimmermans I'm still not sure why ensuring that 0 is still on the list after you've picked is sufficient to ensure that you won't ever get stuck. – btilly Apr 08 '23 at 18:07
Every iteration partitions the sums into 2 halves, and deciding positive or negative just selects which half to keep. It's always possible to keep the one with 0 in it, until you get to [0,x]. Both halves are the same except for an offset, so your choice doesn't affect the following sequence of differences. – Matt Timmermans Apr 08 '23 at 18:14
@btilly yes, I see that your way works too, now – Matt Timmermans Apr 08 '23 at 18:18
How can we output all possible sets `A`? I recursively call the algorithm for each of the left/right arrays when both have zero. This works for the example `{-9, -1, 4, 6} and {-6, -4, 1, 9}` but produces duplicate sets when there is more than one zero in the input. – fmatt Apr 09 '23 at 04:25
To avoid duplicates while listing all sets, you can establish an ordering between members with the same absolute value. If you have absolute value `x` and both sets have zeros, you can make a recursive call to continue with `-x` and then loop to continue with `x`, but make sure that you never produce `x` in the recursive call. This ensures that all the `x`s come before all the `-x`s in every set produced. – Matt Timmermans Apr 09 '23 at 13:06
@MattTimmermans I'm able to come up with examples where the binary search will be exponential in the size of the set to find. However the input is also exponential so maybe that is OK. But if the input was a multiset, a list of pairs `(value, count)`, then the input is smaller, the set is the same, and then this recursion becomes slow. – btilly Apr 09 '23 at 17:37
@btilly the algorithm as I wrote it above is O(n) in the size of the input after sorting. If you adapt it to work with multisets, it's O(min(mn,2^m)) in the worst case, where m is the size of result set and n is the size of the input. There is no binary search. It's a deconvolution of the histogram. – Matt Timmermans Apr 10 '23 at 01:22

Algorithm to recover a set given the sums of all its subsets

2 Answers2