Breadth First Search Optimisation

Question

I've been working on some introductory Graph Theory lately, and came across the following problem statement:

Given a 2D matrix as input, representing a maze (where 'E' is the start of the matrix and 'S' is the end), find the length of the shortest path from E to S. Walls in the maze are represented by '#', while accessible spaces are indicated with '.'. It's also a given that the outer edge of the matrix is covered with walls. If no path exists from E to S, return -1.

Since the graph isn't weighted, I tried implementing a BFS algorithm, using deque. However, when the mazes start hitting around 500x500, execution time hits 10s, and when I try to go up to 1000x1000, it's obviously a lot worse.

Here's my code:

from collections import deque

def shortest_path_1(maze):
    wall, clear, endchar, startchar = '#', '.', 'S', 'E'
    height = len(maze)
    width = len(maze[0])

    def find_start(grid):
        for y in range(1, height-1):
            for x in range(1, width-1):
                if grid[y][x] == startchar:
                    return tuple([x, y])

    start = find_start(maze)

    queue = deque([[start]])

    seen = {start}
    while queue:
        path = queue.popleft()
        x, y = path[-1]
        if maze[y][x] == endchar:
            return len(path)-1
        for (x2, y2) in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 < x2 < width-1 and 0 < y2 < height-1 and maze[y2][x2] != wall and (x2, y2) not in seen:
                queue.append(path + [(x2, y2)])
                seen.add((x2, y2))

    return -1

I found some very useful answers on the site so far, but none of the current ones seem to give any other optimizations that I haven't implemented yet...

Thanks!

EDIT: Thanks to the lovely person who edited my question to make the key words pop :). Here's an example of a matrix that you can run the algorithm on:

#####
#E#S#
#.#.#
#...#
#####

This should return the value 6.

EDIT2: Fixed some small mistakes.

Consider a [more efficient path representation](https://stackoverflow.com/questions/45976278/fastest-way-to-create-multiple-copies-of-a-list/45976467#45976467). — user2357112, Oct 17 '18 at 19:10
(Also, `seen = set(start)` builds a set out of `start`'s elements, which is not the set you want.) — user2357112, Oct 17 '18 at 19:11
Also, you reused the `start` variable, and you seem to have the start and the end switched. — user2357112, Oct 17 '18 at 19:12
If the code is error-free, it might be a better fit for [code review](https://codereview.stackexchange.com/). — timgeb, Oct 17 '18 at 19:13
@rarblack: Why are you going around suggesting edits to make things bold? — user2357112, Oct 17 '18 at 19:14
@user2357112 I don't think I have them switched, they're from French, where 'E' stands for "entrée" and 'S' stands for "sortie" (meaning entrance and exit, respectively). What should I do instead of `seen = set(start)` then? And thanks for pointing out the fact I reused start, I renamed the end and start characters to "endchar" and "startchar". — Peiffap, Oct 17 '18 at 19:18
Actually, you're not returning the path at all. You don't need to store the path more efficiently; you need to completely stop storing the path. — user2357112, Oct 17 '18 at 19:21
I thought about that, but then how would I implement it? That sounds like it could be a major speed up. — Peiffap, Oct 17 '18 at 19:23
You can also use a different algorithm entirely: [jump-point search](https://harablog.wordpress.com/2011/09/07/jump-point-search/) is good for uniform-cost, square grids, such as this one. I've written a Python implementation [here](https://github.com/c2huc2hu/jps/tree/refactor) (use this branch, not master). I haven't had time to optimize it, but it runs a 500x500 grid in 3.7 seconds on my laptop. If you just want to benchmark that test, `python3 -m unittest test.test_small.JPSSmallTests.test_large` will run a 500x500 grid — c2huc2hu, Oct 17 '18 at 19:45

fafl · Accepted Answer · 2018-10-18T16:57:20.050

As suggested in the comments, you don't have to store the paths. Try this:

from collections import deque

def shortest_path_1(maze):
    wall, clear, endchar, startchar = '#', '.', 'S', 'E'
    height = len(maze)
    width = len(maze[0])

    def find_start(grid):
        for y in range(1, height-1):
            for x in range(1, width-1):
                if grid[y][x] == startchar:
                    return (x, y, 0)

    start = find_start(maze)

    queue = deque([start])

    seen = set()
    while queue:
        x, y, d = queue.popleft()

        if not 0 <= x < width:
            continue
        if not 0 <= y < height:
            continue
        if maze[y][x] == wall:
            continue

        if maze[y][x] == endchar:
            return d

        if (x, y) in seen:
            continue
        seen.add((x, y))

        queue.append((x+1, y, d+1))
        queue.append((x-1, y, d+1))
        queue.append((x, y+1, d+1))
        queue.append((x, y-1, d+1))

    return -1

maze = [x for x in """
#####
#E#S#
#.#.#
#...#
#####
""".split('\n') if x]

print shortest_path_1(maze)

This solves a *1000x1000* maze in a couple seconds! Not saving the path was the thing to do indeed, thank you! And thanks to @user2357112 for suggesting that and other changes! — Peiffap, Oct 17 '18 at 19:52

Breadth First Search Optimisation

1 Answers1