1

For an online course I'm trying to solve the knapsack problem: given n items and arrays v and w of their values and weights, respectively, find the combination of items with the maximum value subject to the constraint that the total weight is less than a given threshold W.

I define the solution to the sub-problem A(i, x, v, w) with i items and a total weight constraint x; the problem is thus to compute A(n, W, v, w). I've tried the following:

import sys
sys.setrecursionlimit(3000)

def memoize(f):
    memo = {}
    def helper(i, x, *args, **kwargs):
        if (i, x) not in memo:            
            memo[(i, x)] = f(i, x, *args, **kwargs)
        return memo[(i, x)]
    return helper

@memoize
def A(i, x, v, w):
    if i == 0:
        return 0
    if x >= w[i-1]:
        return max(A(i-1, x, v, w), A(i-1, x-w[i-1], v, w) + v[i-1])
    else:
        return A(i-1, x, v, w)

If I run this on a simple test case, it works:

import pytest

@pytest.fixture
def data_simple():
    W = 6
    n = 4
    v = [3, 2, 4, 4]
    w = [4, 3, 2, 3]
    return W, n, v, w

def test_A(data_simple):
    W, n, v, w = data_simple
    assert A(n, W, v, w) == 8

if __name__ == "__main__":
    pytest.main([__file__, "-s"])

where in this case the maximum value is attained by taking the last two items, each with value 4.

For the actual problem, however, we need to use a bigger set of input data obtained from knapsack_big.txt, where W = 2,000,000 and n = 2,000.

I've tried to run this as follows:

def readfile(file):
    with open(file) as f:
        first_line = f.readline().strip()
        W, n = tuple(map(int, first_line.split()))
        data = [tuple(map(int, line.split())) for line in f.read().splitlines()]
        v, w = zip(*data)
        assert len(v) == n
        return W, n, v, w

@pytest.fixture
def data_big():
    return readfile('knapsack_big.txt')

def test_big(data_big):
    W, n, v, w = data_big
    optimal_value = A(n, W, v, w)

However, when I try to run this I get a RecursionError:

    def helper(i, x, *args, **kwargs):
        if (i, x) not in memo:
>           memo[(i, x)] = f(i, x, *args, **kwargs)
E           RecursionError: maximum recursion depth exceeded

I don't understand this: if every time the function A is called, it is with a decreased value of i, and the maximum value of i is 2000, then shouldn't a recursion depth of 3000 (as set in the beginning) be sufficient?

(What I suspect might be the case is that the memoize function wrapper is getting called every time A is invoked, so it is creating an empty memo dictionary every time instead of filling the dictionary. I based it on an example from http://www.python-course.eu/python3_memoization.php, but perhaps I should instead try some of the implementations at https://wiki.python.org/moin/PythonDecoratorLibrary#Memoize?)

Cœur
  • 37,241
  • 25
  • 195
  • 267
Kurt Peek
  • 52,165
  • 91
  • 301
  • 526
  • Without analyzing your code and ignoring memoization: you seem to classify your recursion as something with linear growth, but the way i understand this: the recursion-depth error is based on number of call stack elements. The [call stack](https://stackoverflow.com/questions/26873627/why-does-python-have-a-maximum-recursion-depth) collects active functions. Your recursive function potentially calls itself two times leading to exponential blowup. This might result in exponential blowup of active functions as i would assume. – sascha Sep 07 '17 at 00:40
  • @sascha: the number of calls goes up exponentially, but the maximum depth of call is linear. The function calls itself twice - but only once at a time. – Hugh Bothwell Sep 07 '17 at 01:01
  • As I understand it, in general the recursive function has an exponential runtime, no better than brute-force. However, the use of memoization is supposed to change this to linear time O(n). (In a 'warm-up' problem, I actually solved the same problem using an explicitly linear-time 'bottom-up' approach in which `A` is an array, not a function; however for this 'big' problem this would require an array containing ~4 billion integers. I've read on the course forum that using memoization would require less memory, as the cache is only computed 'as needed'). – Kurt Peek Sep 07 '17 at 06:52

1 Answers1

2

It falls with Maximum recursion depth exceeded because every time you call the function A helper function is also called, so actually you have two recursive calls for every decrementation of the i. I tried to remove call to helper by implementing memoization inside the function A and it worked:

memo = {}
def A2(i, x, v, w):
    global memo
    if i == 0:
        return 0
    if (i, x) not in memo:
        if x >= w[i - 1]:
            memo[(i, x)] = max(A2(i - 1, x, v, w),
                               A2(i - 1, x - w[i - 1], v, w) + v[i - 1])
        else:
            memo[(i, x)] = A2(i - 1, x, v, w)
    return memo[(i, x)]

Output:

4243395

Aleksei Shestakov
  • 2,508
  • 2
  • 13
  • 14