Why is one of these dynamic programming solutions to a pathfinding robot faster than the other?

Question

The Problem

"There's a robot in the top left corner of the grid. The robot can only move right, or down. The grid can contain invalid/blocked cells that the robot cannot step onto. Verify if there is a path to the bottom right cell in the grid from the top left."

The Solutions

I have two solutions, both of which use memoization in order to prevent re-doing the same work that you might do in a naive implementation.

Solution 1:

 1  def findPathCached( cache, grid, dstR, dstC, currentRow, currentColumn ):
 2     if currentRow > dstR or currentColumn > dstC or \
 3        grid[currentRow][currentColumn] == 1:
 4         return False
 5
 6     if currentRow == dstR and currentColumn == dstC:
 7         return True
 8
 9     if cache[currentRow][currentColumn] is None:
 10         cache[currentRow][currentColumn] = \
 11             findPathCached( cache, grid, dstR, dstC, currentRow, currentColumn+1 ) or \
 12             findPathCached( cache, grid, dstR, dstC, currentRow+1, currentColumn )
 13
 14     return cache[currentRow][currentColumn]

Solution 2:

 1 def findPathCachedFailed( cache, grid, dstR, dstC, currentRow, currentColumn ):
 2     if currentRow > dstR or currentColumn > dstC or \
 3        grid[currentRow][currentColumn] == 1:
 4         return False
 5
 6     if cache[currentRow][currentColumn]:
 7         return False
 8
 9     if ( currentRow == dstR and currentColumn == dstC ) or \
 10        findPathCachedFailed( cache, grid, dstR, dstC, currentRow, currentColumn+1 ) or \
 11        findPathCachedFailed( cache, grid, dstR, dstC, currentRow+1, currentColumn ):
 12         return True
 13
 14     cache[currentRow][currentColumn] = True
 15     return False

Solution 2 reliably runs faster than Solution 1. Doing some timing of each function call with time.time() in python, I can see over 10,000 runs the average run-time in seconds for both is:

Solution 1: 0.004197101092338562
Solution 2: 0.0036973851680755614

Running it by hand, Solution 2 very rarely takes more time than Solution 1. Both solutions are ran against the same grid.

I know the difference is small, but I had thought that Solution 1 would be better than Solution 2 since it caches every result, not just failed paths so I'm kind of surprised Solution 2 is so reliably better than 1.

Can anyone help me understand why Solution 2 is running faster?

Does it depend on how many "invalid/blocked cells" there are? — Heath Raftery, Jun 15 '19 at 00:12
Instead of manually writing caching code like this, try just writing `findPathCachedFailed(grid, dstR, dstC, currentRow, currentColumn)` with one of the memoization decorators; see e.g. [What is memoization and how can I use it in Python?](https://stackoverflow.com/questions/1988804/what-is-memoization-and-how-can-i-use-it-in-python). Also, what size (rows x cols) is grid? — smci, Jun 15 '19 at 00:12
@HeathRaftery both algorithms are ran against the same grid object. I've added this detail to the question. — screeb, Jun 15 '19 at 00:18
@smci This is just some code I wrote to learn more about dynamic programming. It's not actually being used anywhere. The grid is 200x200 cells. — screeb, Jun 15 '19 at 00:19
Also, how Python memoization handles the 2D arg `grid` may depend on whether it's a Python list-of-lists, or a 2D numpy ndarray. I don't know that memoization can handle either properly (i.e understand the actual values not just pointer aliasing), in particular almost surely not the contents of a numpy array. — smci, Jun 15 '19 at 00:56
@HeathRaftery Changing the proportion of invalid/blocked cells in the grid to be larger/smaller doesn't seem to have much of an effect, solution 2 still reliably comes out on top. — screeb, Jun 15 '19 at 01:21

score 3 · Answer 1 · edited Jun 20 '20 at 09:12

3

The correct way to figure this out is to run it under a profiler (though I don’t know if there is good Python profiler).

But here are some things I guess are probably less efficient in solution 1:

In solution 1, you first check whether you’ve reached the bottom-right cell, and if so, you return early. If you haven’t reached the bottom-right cell, you then check the cache and possibly skip some work. Since most cells are not the bottom-right cell, the bottom-right test usually does not cause an early return.

In solution 2, you first check the cache and possibly return early. If the cache check fails, then you check whether you’ve reached the bottom-right cell. So if the cache check hits often, you skip a lot of bottom-right checks that you perform in solution 1.
In solution 1, you fetch cache[currentRow][currentColumn] on line 9 and on line 14. In solution 2, you only fetch it on line 6. So solution 1 performs on the order of twice as many of these fetches as solution 2.

edited Jun 20 '20 at 09:12

Community

1
1

answered Jun 15 '19 at 00:19

rob mayoff

375,296
67
796
848

Ok, but manually caching `cache[currentRow][currentColumn]` just for marginal (<10%) performance gain is exactly the sort of code we're not supposed to write... memoization decorator approach is better. – smci Jun 15 '19 at 00:23
2

I’m not providing advice on how to write the code. I’m just pointing out what might be less efficient because that was the question. – rob mayoff Jun 15 '19 at 00:23
Right. Profiling is great, and your profiling insights are great. But we need to warn the OP off actually tweaking caching code manually. – smci Jun 15 '19 at 00:24

ruakh · Accepted Answer · 2019-06-15T16:57:36.680

The reason is actually very simple: when the function returns True, there's no point caching the result, because that cached result will never be read, because no more function calls will happen after that, because when a recursive call returns True (meaning "I've found a path to (dstR, dstC)"), the entire call stack quickly unwinds, with every single call immediately returning True (still meaning "I've found a path to (dstR, dstC)").

So this line of thought:

[…] I had thought that Solution 1 would be better than Solution 2 since it caches every result, not just failed paths […]

doesn't work, because those extra caches are just useless writes that will never be read (other than immediately, since as rob mayoff points out, you use return cache[currentRow][currentColumn] in Solution #1, but of course that could easily be changed to just return a local variable).

This is in part because of the short-circuiting behavior of or; in an expression like

findPathCached( cache, grid, dstR, dstC, currentRow, currentColumn+1 ) or \
findPathCached( cache, grid, dstR, dstC, currentRow+1, currentColumn )

the second function call won't happen if the first one returned True.

If you want Solution #1's extra caching to be useful, you might consider a problem where this short-circuiting is not possible; for example, instead of just returning True or False to indicate whether a path is possible, try returning how many paths are possible — so, 0 instead of False, a positive number instead of True, and + instead of or. You'll suddenly find that Solution #1 is much faster, because it can cover the whole grid in time proportion to the size of the grid, whereas Solution #2 will repeat huge numbers of calls and take much longer.

Incidentally, you can solve this problem using only O(min(m, n)) extra space, instead of O(mn) extra space, with a slightly different approach that still requires only O(mn) time. Specifically: you can iterate over the rows of the grid, and for each one, create a list of which cells in the row are reachable from (0, 0). The list for a given row only depends on the list from the previous row, so you don't need to keep all old lists as you iterate. (That's actually O(n) extra space, but I say you can do it in O(min(m, n)) extra space because you can start by checking which dimension is bigger and iterating over columns instead of rows if it turns out that rows are longer than columns.)

score 0 · Answer 3 · answered Jun 15 '19 at 01:41

This is faster

 6     if cache[currentRow][currentColumn]:
 7         return False

than

 6     if currentRow == dstR and currentColumn == dstC:
 7         return True

It only checks if object exists i think. I do not see comparing values. I also think it returns quicker, and stops rest of code from excecution.

I also think, this line from solution '1' should be also faster than '2', in '2' you got 2 comparison and 4 boolean operations

 9     if cache[currentRow][currentColumn] is None:

Technically speaking, you got 2 ways of optimizing this, but checking operations on list (matrixes) or just correcint if statements.

Keep in mind, that some solutions can call 'return' more quicker.

Without profiler, I would simply test one feature after another on some basic exmaples :)

Why is one of these dynamic programming solutions to a pathfinding robot faster than the other?

The Problem

The Solutions

3 Answers3