3

I don't believe this is efficient at all. I'm trying to create a faster implementation of this (preferably binary search or use of Sets perhaps), but for now this is where I'm at. I'm not sure if it's relevant, but I created a count variable just to see how many times the method was being called. It came to 577 times.

This is the "Subset Sum" algorithm, where I have to display all the subsets that add to a target sum, in this case 3165. It wasn't specifically stated that this problem was the algorithm, but I realized it was indeed the same concept.

My question is how can I know how efficient this program is, and are the method calls an indicator?

public class SubsetSumAlgorithm {

    static int count = 0;

    public static void main(String[] args) {

        int[] array = {26, 39, 104, 195, 403, 504, 793, 995, 1156, 1673};

        System.out.println("COLLECTIONS IN ARRAY THAT ADD TO 3165: ");
        findCollections(array, 0, 0, 3165, "");
        System.out.println("Method called " + count + " times.");//method calls
    }

    public static void findCollections(int[] array, int index, int currentPosition, int sum, String collection) {

        count++; //<---COUNTING THE METHOD CALLS HERE

        if (array.length < index || currentPosition > sum) {
            return;
        }
        //if sum is found, add to subset
        for (int i = index; i < array.length; i++) {
            if (currentPosition + array[i] == sum) {
                System.out.println(collection + " " + array[i]);
            }
            //otherwise, call the method again
            else if (currentPosition + array[i] < sum) {//recursive call
                findCollections(array, i + 1, currentPosition + array[i], sum, collection + " " + array[i]);
            }
        }
    }
}

And here is the output:

COLLECTIONS IN ARRAY THAT ADD TO 3165: 
26 195 793 995 1156
195 504 793 1673
Method called 577 times.
Makoto
  • 104,088
  • 27
  • 192
  • 230
DevOpsSauce
  • 1,319
  • 1
  • 20
  • 52
  • You may want to see http://stackoverflow.com/questions/4640034/calculating-all-of-the-subsets-of-a-set-of-numbers – AMACB Mar 04 '16 at 19:26
  • The only big o calculation I saw in that post was 2^n, but that was using Sets. – DevOpsSauce Mar 04 '16 at 19:33
  • This is inevitably going to be exponential time and slow. If you find an efficient solution to this problem, you will win a million dollars, eternal fame in computer science academia, and you will destroy the Internet. – Louis Wasserman Mar 04 '16 at 19:39
  • @LouisWasserman Hahaha! Maybe someday. – DevOpsSauce Mar 04 '16 at 19:51
  • 2
    @LouisWasserman Actually dynamic programming solves this one in pseudo-polynomial time. Which is to say that there are still exponentially slow cases, but MOST of the time it will return quickly. In fact for integers no more than the size of the set times the range from the minimum sum to the maximum sum. – btilly Mar 04 '16 at 20:02
  • @btilly, thanks for the hint. I implemented that algorithm [here](https://gist.github.com/kedarmhaswade/cfaedb1cb38c28053b42). If there are `n` elements in the set and `S` is the number to express as a sum of elements of the set, then this algorithm has a time complexity of `O(S.n)` (I believe it could be tighter) and memory requirements of `O(S).a(n)`, where `a(n)` is average size of a subset of an `n-set`. Is that right? – Kedar Mhaswade Mar 13 '16 at 18:28

2 Answers2

2

My question is how can I know how efficient this program is, and are the method calls an indicator?

The only way to derive the asymptotic run time of an algorithm is to actually get your hands dirty and do it manually. With this being said, trying to keep track of method calls while your algorithm runs isn't a reliable or reasonable way to derive the runtime of an algorithm.

To start trying to figure out how fast or slow your algorithm runs, we can start by deriving the recurrence from analyzing the run time of each line:

1    public static void findCollections(int[] array, int index, int currentPosition, int sum, String collection) {
2        if(array.length < index || currentPosition > sum)
3            return;
4        for(int i = index; i < array.length; i++) {
5            if(currentPosition + array[i] == sum) {
6                System.out.println(collection + " " + array[i]);
7            }
8            else if(currentPosition + array[i] < sum) {
9                findCollections(array, i + 1, currentPosition + array[i], sum, collection + " " + array[i]);
10           }
11       }
12   }

...

1    -
2    1
3    1
4    n+1
5    n
6    n
7    -
8    n
9    n*T(n-1)
10   -
11   -
12   -

One we've analyzed each line, we can derive the recurrence:

    T(n) = n*T(n-1) + 4n + 3
 => T(n) = n*T(n-1) + 4n + Θ(1)
 => T(n) = n*T(n-1) + 4n

Since we cannot use the Master Theorem, and attempting a recursion tree would get very messy and confusing very fast with this particular recurrence, we can solve this recurrence using the substitution method. We will set our initial guess to be 2^n since it looks to me like it'll probably be exponential:


Upper Bound O(2^n)


This would imply that

T(n) ≤ c * 2^n

if this proposition holds true, this means we also know that

T(n-1) ≤ c * 2^(n-1)

Therefore, we can now write our recurrence and attempt to prove our guess:

    c * 2^n ≥ n * (c * 2^(n-1)) + 4n
 => c * 2^n ≥ c * n * 2^(n-1) + 4n

this holds true for all n > 0 | c = 1, therefore T(n) = O(2^n)


Lower Bound Ω(2^n)


This would imply that

T(n) ≥ c * 2^n

if this proposition holds true, this means we also know that

T(n-1) ≥ c * 2^(n-1)

Therefore, we can now write our recurrence and attempt to prove our guess:

    c * 2^n ≤ n * (c * 2^(n-1)) + 4n
 => c * 2^n ≤ c * n * 2^(n-1) + 4n

this holds true for all n > 0 | c = 5000, therefore T(n) = Ω(2^n)


Conclusion Θ(2^n)


Since we've prove that the algorithm is both O(2^n) and Ω(2^n), therefore it is also, by definition, Θ(2^n). So basically, your algorithm's time complexity is Θ(2^n) which is very slow as it is exponential.

Nick Zuber
  • 5,467
  • 3
  • 24
  • 48
  • 1
    EXCELLENT answer! I hope to improve this algorithm eventually, but thank you for helping me see where it currently stands. – DevOpsSauce Mar 04 '16 at 22:37
  • @csheridan Glad I could help and best of luck! – Nick Zuber Mar 04 '16 at 22:39
  • I'm a little lost on your calculation of n * T(n - 1) on line 9 of your analysis. – DevOpsSauce Mar 04 '16 at 23:46
  • Also, what is the variable "c"? – DevOpsSauce Mar 04 '16 at 23:52
  • @csheridan So the `for` loop will loop (worst case) `n` times, therefore the actual line which does the `for` loop call will run `n+1` times (to account for the final check before deciding we're done looping). So knowing this, everything within the `for` loop will be called `n` times. For the `else if` block within this `for` loop, we assume the worst case which is that it is always true, which means your function (we refer to as `T(n)`) gets called `n * T(n-1)` times. – Nick Zuber Mar 04 '16 at 23:52
  • @csheridan `c` refers to some arbitrary constant. This is given by the definition of Big O, Big Omega, and Big Theta – Nick Zuber Mar 04 '16 at 23:53
  • Alright, I believe I'm following you. I'm still new to measuring time complexity so I appreciate your insight. – DevOpsSauce Mar 04 '16 at 23:56
  • @csheridan I'm still relatively new to analyzing asymptotic time complexities myself, doing this problem out was a fun challenge =) – Nick Zuber Mar 04 '16 at 23:58
1

The number of calls to a Method are a decent indicator if you want to find the trend experimentally. You would need to run the function many times with varying inputs, collect the data, graph it, and find a best fit curve... This would assume that calls to findCollections directly correlate with the time it takes to execute. However, finding the worst case here isn't too hard. You just have to think about what input variables would cause your execution to go on the longest.

In this case if the sum variable is larger than the sum of the whole set you will see the worst execution (the most recursion). If this condition is true, then the method bellow would be equivalent to yours:

public static void findCollections(int[] array, int index, int currentPosition, int sum, String collection) {
    for(int i = index; i < array.length; i++) {
        findCollections(array, i + 1, currentPosition + array[i], sum, collection + " " + array[i]);
    }
}

Its easy to see here that this would cycle through all possible sub-sets of array. This makes your solution O(2^n) where n is the size of array

flakes
  • 21,558
  • 8
  • 41
  • 88