
I am learning backtracking algorithms and I have written a program for subset sums. Here's the program:

items = range(1, 10)
result = []
def backtracking(array, target, temparray= []):
    if target == 0:
        result.append(temparray)
        return

    for index, item in enumerate(array):
        if item > target:
            return
        temparray.append(item)
        backtracking(array[index+1:], target - item, temparray[:])
        temparray.pop()
    return

backtracking(items, 8)
[print(arr) for arr in result]

That is, to get a sum of 8 from the values 1 to 9 (no repetition), we get this output:

[1, 2, 5]
[1, 3, 4]
[1, 7]
[2, 6]
[3, 5]
[8]

The output is well and good, but when I do the same for any target above 50, it keeps running seemingly forever. I waited almost an hour and still couldn't get a result, so I had to terminate the program abruptly. I have had the same experience with other backtracking problems when the values get close to 40-50. It seems like backtracking is inherently very slow, as it involves recursion and lots of function calls.

I want to ask if there is anything that speeds things up in general while implementing backtracking algorithms. Is there any more efficient algorithm to use in its place? Memoization doesn't work, as lists passed as arguments are not hashable.

I have devised some cool solutions to some of the Project Euler problems but am unable to get an answer for larger values. I know my algorithm is correct, but it is very slow. Please help.

Gsbansal10
    Just a note, using a mutable object like a list as a default argument is a recipe for pain. See [this post](https://stackoverflow.com/questions/1132941/least-astonishment-and-the-mutable-default-argument). – wbadart Dec 06 '18 at 18:09
  • The indentation of your code sample is not quite right. – Tomalak Dec 06 '18 at 18:13

1 Answer


If you are asking for a sum from unique numbers 1 through N, then in the worst case (when the requested sum is >= sum(range(1, N+1))) the algorithm you use will iterate through all subsets of range(1, N+1) before it gives up. The number of subsets of a set of N items is 2**N; this is intuitive because each item in the original set can either be included or excluded, hence 2 possibilities per item.

The algorithm is O(2**N). For N=16, in the worst case it will examine 65,536 possibilities. If N is 50, in the worst case it will examine 1,125,899,906,842,624 possibilities.

The algorithm has an early out when a partial sum exceeds the desired target, so the worst case does not always apply, and results come faster for small targets. The analysis at https://www.geeksforgeeks.org/subset-sum-problem/ suggests that this reduces the complexity to O(2**(N/2)) on average. For N=50, that implies 33,554,432 iterations on average, with potentially far more.
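To see the exponential growth concretely, here is a hypothetical instrumented sketch (not the poster's code) of the same search that counts recursive calls. With the target set to the full sum, the early out never fires, so every subset is visited:

```python
# Hypothetical instrumentation: count how many recursive calls the
# subset-sum search makes when the target equals sum(1..n), i.e. when
# the pruning check never triggers and every subset is explored.
def count_calls(n, target):
    calls = 0

    def backtrack(start, remaining):
        nonlocal calls
        calls += 1                  # one call per subset prefix explored
        if remaining == 0:
            return
        for i in range(start, n + 1):
            if i > remaining:       # early out (never fires here)
                return
            backtrack(i + 1, remaining - i)

    backtrack(1, target)
    return calls

for n in (8, 10, 12, 14):
    # target = sum(1..n), so the call count is exactly 2**n
    print(n, count_calls(n, n * (n + 1) // 2))
```

Each added element doubles the number of calls, which is why a target that works at N=16 becomes hopeless by N=50.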

Backtracking effectively doubles the value of N that can be computed with a given number of iterations, but it doesn't eliminate the exponential nature of the algorithm.

All that said, one thing that is unnecessarily slowing down your solution is the use of `array[index+1:]`. In Python, slice notation creates a new, distinct list and copies the items into it, rather than creating a reference into the existing list as in some other languages (e.g., Go). This is fundamental to the language, since the recipient of the slice is free to modify its contents without the original list being affected. In your case, the copy is unnecessary, since the contents of `array` are never modified. It would be an improvement to pass the same array on each recursive call, along with an index marking the relevant starting point.
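That suggestion can be sketched as follows (a rework of the question's function with hypothetical names, passing a start index instead of slicing; the list is copied only when a solution is recorded):

```python
# Sketch: same subset-sum search over 1..n, but recursing on a start
# index rather than slicing the array, so no per-call list copies.
def subset_sums(n, target):
    result = []
    temp = []

    def backtrack(start, remaining):
        if remaining == 0:
            result.append(temp[:])   # copy only when recording a solution
            return
        for i in range(start, n + 1):
            if i > remaining:        # later values are even larger: prune
                return
            temp.append(i)
            backtrack(i + 1, remaining - i)
            temp.pop()

    backtrack(1, target)
    return result

print(subset_sums(9, 8))
# [[1, 2, 5], [1, 3, 4], [1, 7], [2, 6], [3, 5], [8]]
```

This also sidesteps the mutable-default-argument pitfall mentioned in the comments, since the shared list lives in the enclosing scope instead of a default parameter.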

Of course, that would not solve the big issue, which is the exponential complexity of the problem.

Sam