Finding the smallest possible number which cannot be represented as sum of 1,2 or other numbers in the sequence

Question

I am a newbie in C++ and need logical help in the following task.

Given a sequence of n positive integers (n < 10^6; each given integer is less than 10^6), write a program to find the smallest positive integer, which cannot be expressed as a sum of 1, 2, or more items of the given sequence (i.e. each item could be taken 0 or 1 times). Examples: input: 2 3 4, output: 1; input: 1 2 6, output: 4

I cannot seem to construct the logic out of it, why the last output is 4 and how to implement it in C++, any help is greatly appreciated. Here is my code so far:

    #include<iostream>
    using namespace std;

    const int SIZE = 3;

    int main()
    {
//Lowest integer by default
int IntLowest = 1;
int x = 0;
//Our sequence numbers
int seq;
int sum = 0;
int buffer[SIZE];
//Loop through array inputting sequence numbers
for (int i = 0; i < SIZE; i++)
{
    cout << "Input sequence number: ";
    cin >> seq;
    buffer[i] = seq;
    sum += buffer[i];
}
int UpperBound = sum + 1;

int a = buffer[x] + buffer[x + 1];
int b = buffer[x] + buffer[x + 2];
int c = buffer[x + 1] + buffer[x + 2];
int d = buffer[x] + buffer[x + 1] + buffer[x + 2];


for (int y = IntLowest - 1; y < UpperBound; y++)
{
    //How should I proceed from here?

}
return 0;
    }

4 can't be written as a sum of numbers picked (at most once) from the set {1,2,6}, while 1, 2, and 3 can. Thus, 4 is the answer. You probably want to understand the problem and work out an algorithm away from the keyboard first, before you start thinking about implementing anything. — molbdnilo, Mar 18 '14 at 14:53
So following the logic 2 3 4. we take the 1st number is the sequence, which is 2, 2 is 1 + 1 - therefore we cannot take it. 3 is 1 + 2 - we cannot take it either, 4 is 1 + 3 so we cannot take it and the answer will be 1 then? Am i right? — BAT, Mar 18 '14 at 14:57
@BatyrAtamamedov No, that is not right. The reason that 2 isn't valid isn't because because 2 = 1 + 1 (1 doesn't appear in the sequence so that wouldn't invalidate it, and in any case, even if 1 did appear in the sequence, you cannot use the same number more than once in your sum). It's invalid because 2 = 2 (which is in the sequence). The answer is 1 because you start with the lowest integer (and 1 can't be expressed as the sum of any numbers in the sequence). Your other mistake is that you are starting with the sequence itself instead of the lowest integer. — JBentley, Mar 18 '14 at 15:08
I edited my code above I have a problem with writing the check statement inside of my loop. — BAT, Mar 18 '14 at 16:04

Natalya Ginzburg · Answer 1 · 2014-03-18T16:49:21.133

What the answer of Voreno suggests is in fact solving 0-1 knapsack problem (http://en.wikipedia.org/wiki/Knapsack_problem#0.2F1_Knapsack_Problem). If you follow the link you can read how it can be done without constructing all subsets of initial set (there are too much of them, 2^n). And it would work if the constraints were a bit smaller, like 10^3.

But with n = 10^6 it still requires too much time and space. But there is no need to solve knapsack problem - we just need to find first number we can't get.

The better solution would be to sort the numbers and then iterate through them once, finding for each prefix of your array a number x, such that with that prefix you can get all numbers in interval [1..x]. The minimal number that we cannot get at this point is x + 1. When you consider the next number a[i] you have two options:

a[i] <= x + 1, then you can get all numbers up to x + a[i],
a[i] > x + 1, then you cannot get x + 1 and you have your answer.

Example:

you are given numbers 1, 4, 12, 2, 3.

You sort them (and get 1, 2, 3, 4, 12), start with x = 0, consider each element and update x the following way:

1 <= x + 1, so x = 0 + 1 = 1.

2 <= x + 1, so x = 1 + 2 = 3.

3 <= x + 1, so x = 3 + 3 = 6.

4 <= x + 1, so x = 6 + 4 = 10.

12 > x + 1, so we have found the answer and it is x + 1 = 11.

(Edit: fixed off-by-one error, added example.)

Thank you @Natalya, so I sorted my array using Bubble sort and now I need another for loop to check the statement that you suggested. — BAT, Mar 18 '14 at 17:25
Bubble sort will only work for a small array, it has O(n^2) complexity. It will work to check the algorithm on some small inputs, but won't work for big cases. I'd suggest stl sort, it's O(nlogn), and rather easy to use: "std::sort(a, a + n);" — Natalya Ginzburg, Mar 18 '14 at 17:46
I understand you @Natalya, but for testing purposes I want to have at least a little program working so that I can project myself for something more. I sorted out my array but now I don't know how to implement above procedures that you showed me. — BAT, Mar 18 '14 at 17:49

score 3 · Answer 2 · edited May 23 '17 at 12:16

I think this can be done in O(n) time and O(log2(n)) memory complexities. Assuming that a BSR (highest set bit index) (floor(log2(x))) implementation in O(1) is used.

Algorithm:

1 create an array of (log2(MAXINT)) buckets, 20 in case of 10^6, Each bucket contains the sum and min values (init: min = 2^(i+1)-1, sum = 0). (lazy init may be used for small n)

2 one pass over the input, storing each value in the buckets[bsr(x)].

for (x : buffer) // iterate input
    buckets[bsr(x)].min = min(buckets[bsr(x)].min, x)
    buckets[bsr(x)].sum += x

3 Iterate over buckets, maintaining unreachable:

int unreachable = 1 // 0 is always reachable
for(b : buckets)
    if (unreachable >= b.min)
        unreachable += b.sum 
    else
        break
return unreachable

This works because, assuming we are at bucket i, lets consider the two cases:

unreachable >= b.min is true: because this bucket contains values in the range [2^i...2^(i+1)-1], this implies that 2^i <= b.min. in turn, b.min <= unreachable. therefor unreachable+b.min >= 2^(i+1). this means that all values in the bucket may be added (after adding b.min all the other values are smaller) i.e. unreachable += b.sum.

unreachable >= b.min is false: this means that b.min (the smallest number the the remaining sequence) is greater than unreachable. thus we need to return unreachable.

+1 Brilliant. I suggest adding a proof regarding the linear time complexity, otherwise it seems to be O(max(n,log MAXVAL)). It can be proven by examining the sorted sequence of numbers, and observing that as long as the sequence doesn't reach a "hole", any new value can make the sum advance at most one bucket ahead. In other words, the number of visited buckets is O(n). — Eyal Schneider, May 11 '14 at 20:03
Ok, I've changed the init of min to 2^(i+1)-1 instead of 0 in my orig answer. this allowed a simpler phase 2, and the phase 3 halts when a gap is encountered. — karpada, May 13 '14 at 20:30

Carlo M. · Accepted Answer · 2014-03-18T16:02:02.693

0

The last output is 4 because:

1 = 1
2 = 2
1 + 2 = 3
1 + 6 = 7
2 + 6 = 8
1 + 2 + 6 = 9

Therefore, the lowest integer that cannot be represented by any combination of your inputs (1, 2, 6) is 4.

What the question is asking: Part 1. Find the largest possible integer that can be represented by your input numbers (ie. the sum of all the numbers you are given), that gives the upper bound

UpperBound = sum(all_your_inputs) + 1

Part 2. Find all the integers you can get, by combining the different integers you are given. Ie if you are given a, b and c as integers, find:

a + b, a + c, b + c, and a + b + c

Part 2) + the list of integers, gives you all the integers you can get using your numbers.

cycle for each integer from 1 to UpperBound

for i = 1 to UpperBound if i not = a number in the list from point 2) i = your smallest integer break

This is a clumsy way of doing it, but I'm sure that with some maths it's possible to find a better way?

EDIT: Improved solution

//sort your input numbers from smallest to largest
input_numbers = sort(input_numbers)
//create a list of integers that have been tried numbers
tried_ints = //empty list
for each input in input_numbers
       //build combinations of sums of this input and any of the previous inputs
       //add the combinations to tried_ints, if not tried before
       for 1 to input
           //check whether there is a gap in tried_ints
           if there_is_gap
                //stop the program, return the smallest integer
                //the first gap number is the smallest integer

edited Mar 18 '14 at 16:02

answered Mar 18 '14 at 14:55

Carlo M.

331
1
3
12

So I Have to find the range in between those numbers which are represented by summation. In 1 2 6 the range of possible sums starts from 1 to 9 and the lowest number which cannot be represented is 4. Thanks @Voreno, that makes sense. I will upload my code once I have something working. – BAT Mar 18 '14 at 15:15
"Part 2" won't work with the constraints given: a set of 10^6 elements has 2^(10^6) subsets, it's quite a lot, so you can't just consider each of them. – Natalya Ginzburg Mar 18 '14 at 15:29
@NatalyaGinzburg You are perfectly right there, I have edited my post with a different proposed algorithm, it should work better now and avoid you trying all possible combinations endlessly. – Carlo M. Mar 18 '14 at 16:02
@Voreno still it can work practically endlessly - the number we are searching for can be rather big... – Natalya Ginzburg Mar 18 '14 at 16:11
@Voreno, can you please check my code above, I wrote some code but not sure how to proceed after creating a loop. Thank you! – BAT Mar 18 '14 at 16:14
@BatyrAtamamedov, actually this approach is wrong, I'd post another one, if there were a possibility to post answer (not a comment) :/ – Natalya Ginzburg Mar 18 '14 at 16:17
@Voreno, say, with numbers 1, 2, 4, ... 2^19, 10^6, 10^6, ..., 10^6 we can get numbers almost up to 10^12, and you cannot just check them all. – Natalya Ginzburg Mar 18 '14 at 16:19

score 0 · Answer 4 · answered Mar 18 '14 at 15:00

The output of the second input is 4 because that is the smallest positive number that cannot be expressed as a sum of 1,2 or 6 if you can take each item only 0 or 1 times. I hope this can help you understand more:

You have 3 items in that list: 1,2,6 Starting from the smallest positive integer, you start checking if that integer can be the result of the sum of 1 or more numbers of the given sequence.

1 = 1+0+0
2 = 0+2+0
3 = 1+2+0
4 cannot be expressed as a result of the sum of one of the items in the list (1,2,6). Thus 4 is the smallest positive integer which cannot be expressed as a sum of the items of that given sequence.

Thank you @Sandra now it's clearer. Now I am trying to write it in C++, I will post it below once I have something written. — BAT, Mar 18 '14 at 15:12

Finding the smallest possible number which cannot be represented as sum of 1,2 or other numbers in the sequence

4 Answers4