Python function returns different result with memoization

Question

My Python pathfinding function returns different results if I use a memoizing decorator. It returns a correct value on its own, but after being memoized, it returns an incorrect value.

The function I'm talking about looks like this:

@functools.lru_cache(maxsize=None)
def recursiveTraversal(originIndex, target, steps):
    startingVertex = data[originIndex]

    if startingVertex["ID"] == target:
        if steps == 0:
            return Path(0, [])
        else:
            return None
    else:
        if steps == 0:
            return None
        else:
            potentialPaths = []
            for i in startingVertex["Edges"]:
                nextVertex = data[i]
                nextPath = recursiveTraversal(i, target, steps - 1)
                if nextPath == None:
                    continue
                nextPath.weight += int(nextVertex["Weight"])
                nextPath.vertices.append(i)

                potentialPaths.append(nextPath)
            if len(potentialPaths) > 0:
                minPath = min(potentialPaths, key=lambda x: x.weight)
                return minPath
            else:
                return None

A complete runnable example can be found here. The upper part of the file is all data, and the code is at the bottom. To reproduce this, just comment out line 15, and observe that the output is different.

How can I get the memoized version to output the same thing as the normal version?

Can you give us a [mcve]? First, that input should be part of the question, not a comment. More importantly, if you just run `recursiveTraversal(0, "1", 3)` on the code we're given, it's obviously just going to raise a `NameError` on `data`. — abarnert, Jul 29 '18 at 22:46
Also, you say the data here wasn't changed at all… but aren't those `nextPath` values just references to some node in `data`? From a quick glance, I don't see anywhere that could be returning anything else. And you immediately modify whatever's returned by changing its weight and appending another vertex. — abarnert, Jul 29 '18 at 22:49
`nextVertex ` is a reference to `data`, but I'm actually changing `nextPath`, which is a Path object, not a reference to `data`. — Reubend, Jul 29 '18 at 22:53
I've just updated this with a runnable example (complete with `data` and the `Path` class). — Reubend, Jul 29 '18 at 23:19

score 1 · Accepted Answer · answered Jul 30 '18 at 01:31

The problem is that you are modifying attributes of the return values of recursiveTraversal. This function returns Path objects and you modify their attributes weight and vertices. So for the non-cached version each time you call the function with (x, y, z) arguments a fresh Path(0, []) object will be created and its attributes are modified later on in the for loop. But for each (x, y, z) call you are assured to start from a fresh object. Now for the cached version, instead of providing a fresh object by going all the way down the recursive tree, the cache wrapper just provides you with the instance of a previously created Path object (which already has modified weight and vertices attributes) and these are modified even further (i.e. this modifies the cache). This can be seen from the following example:

# Augment `Path` class with `__repr__`.
class Path:
    # Other methods go here.

    def __repr__(self):
        return '{}({}, {})'.format(self.__class__.__name__, repr(self.weight), repr(self.vertices))

data = [
    {'ID': '2', 'Weight': 1, 'Edges': [1]},
    {'ID': '1', 'Weight': 1, 'Edges': []}
]

print(recursiveTraversal(0, '1', 1))  # Prints "Path(1, [1])".
print(recursiveTraversal(1, '1', 0))  # Prints "Path(1, [1])".

Checking the function recursiveTraversal is seems that for steps=0 it should return Path(0, []) in case the target matches. Nevertheless it returns Path(1, [1]). This happens because the previous call to recursiveTraversal already invoked recursiveTraversal(1, '1', 0) and modified the result's weight and vertices attributes. When performing the second explicit call to recursiveTraversal(1, '1', 0) you'll get back the cached reference to that object.

Possible solution

A possible solution is to create copies of cached objects before modifying them further. This prevents that the cache gets modified.

from copy import deepcopy

# Within `recursiveTraversal`:
# ...
nextPath = deepcopy(recursiveTraversal(i, target, steps - 1))
# ...

Thank you so much, this is exactly right! For anyone else who encounters this, it seems the cache stores _references_ to the return values, not the return values themselves. That's the root of my problem. — Reubend, Jul 30 '18 at 01:47
@Reubend The cache stores *exactly* what the function returns. However, in Python, the notion of a variable is different than from example in C++. [This post](https://stackoverflow.com/q/986006) and [this post](https://stackoverflow.com/q/11007627/3767239) will give you more insight. — a_guest, Jul 30 '18 at 06:51

Python function returns different result with memoization

1 Answers1

Possible solution