"Confused Digger" algorithm running too slow, any way to speed it up?

Question

I have an algorithm that was based on a backtracking maze algorithm with some parts removed, resulting in this wonderful dungeon. Unfortunately, it runs extremely slowly which makes it almost impossible to fill a decently-sized map with. I really don't want to throw it out, but I can't think of any way to speed it up. I'm using Python, so I know that's part of my problem, but I'm not exactly prepared to throw out my roguelike's entire existing codebase, which runs fast enough right now. Here's the code currently:

    start = (random.randint(0, self.width), random.randint(0, self.height))
    self.dungeon['up_stairs'] = start

    visited = [start]
    while len(visited) < 300:
        current = visited[-1]
        apos = random.choice([(1, 0), (0, 1), (0, -1), (-1, 0)])
        new = utils.tuple_add(current, apos)
        if not new in visited and self.on_map(new):
            self.place_cell(new, is_wall=False)
            visited.append(new)
        else:
            visited.pop()

[edit]

To answer some questions in the comments, place_cell either creates a wall or an empty cell at the supplied position-tuple based on the positional argument is_wall. So for instance, in the code above, the self.place_cell(new, is_wall=False) call changes the cell at the position new on the map to an empty cell.
Visited should really be called something else, I'm just... lazy that way. I'll fix it later probably.
The < 300 condition is because 299 cells is the most it can draw in a reasonable time frame. (Beyond 299 cells, it suddenly starts hanging.)

The **first** thing you should do is profile your script. It's easy. See [**_How can you profile a script?_**](http://stackoverflow.com/questions/582336/how-can-you-profile-a-script) — martineau, Feb 10 '17 at 23:25
"In time" is so relative here... OP uses `width` and `height`, so unless there's something wrong with `place_cell` which isn't visible here, then the code is ok and should run without time issues. If the `place_cell` is ok, why shouldn't the code be posted on code review is a mystery to me. — Peter Badida, Feb 10 '17 at 23:26
I use place_cell in a standard rouge room-and-corridor algorithm and it works just fine. FYI I answered some questions in the post above. — Alexis Dumas, Feb 10 '17 at 23:29
I think the `pop` part makes it slow as it's popping out the last inserted item to visited. So a series of bad random choices actually removes a series of added items. I don't know about the algorithm though or if it's designed that way for reasons. — Fallen, Feb 10 '17 at 23:31
This algorithm tries to create a path of length 300 by picking random directions and extending the path in those directions; if it ever runs into a cell already on the path (including if it turns around and tries to move to where it just was), it instead removes the last path segment. It stops when it hits a path length of 300, and every cell it ever tried to add to the path becomes open space. — user2357112, Feb 10 '17 at 23:33
FYI you can see the whole project at: https://github.com/christopherdumas/LMS. Also, I want it to backtrack from dead ends and try a new direction (that's what pop does) because if it doesn't (and I haven't tried this so I'm not sure) I think it will just end up with one long snaky line. — Alexis Dumas, Feb 10 '17 at 23:33
This is going to do a ton of random flailing about, shutting itself off in an enclosure and randomly banging into the enclosure boundaries until it manages to backtrack out, cutting path segments off because it tried to turn around and encountered the path it just laid, etc. Unfortunately, it doesn't look like there's a good way to get the same behavior in terms of dungeons generated without all this slow flailing. You'll probably have to use a different algorithm that generates dungeons with a different feel to them. — user2357112, Feb 10 '17 at 23:36
Compare your algorithm to one found in [Wikipedia](https://en.wikipedia.org/wiki/Maze_generation_algorithm#Python_code_examples). There is a small but important difference. Your algorithm chooses a random direction and then checks if it's valid; the Wikipedia algorithm checks valid directions and then chooses at random between them. In addition explore other algorithms on that page. — n. m. could be an AI, Feb 10 '17 at 23:43
@martineau When you need to reduce algorithm complexity, profiling is the last thing you want to do. — n. m. could be an AI, Feb 10 '17 at 23:48
It seems that your algorithm will pop more the more dense the visited array is. So you're depending on probability to reach the limit. And it decreases, the bigger limit you choose. That's a big flaw. I don't know why you don't want to just implement a standard backtracking algorithm. Do you wan't it to be unicursal? — Sopel, Feb 11 '17 at 00:11
@n.m.: That's far from the only difference, and changing just that part would produce an even more broken algorithm or really dull dungeons. Other important differences are that we're checking whether new cells are on the path instead of whether we've ever visited them, and that there don't seem to be any walls *between* cells here, just cells that are walls and cells that aren't walls. This algorithm isn't intended to match that maze generation algorithm. It's not even intended to be a maze generator at all. — user2357112, Feb 11 '17 at 00:14
@KeyWeeUsr besides running too slowly, OP also says that it hangs for mazes larger than 300, so it's not working code. — samgak, Feb 11 '17 at 00:45
@user2357112 oh yes, my bad, the `visited` array is not really "visited" at all. This is more important. The other one is not that important, either variant would work. — n. m. could be an AI, Feb 11 '17 at 01:05
@n.m.: How can you assume the algorithm's complexity needs to be reduced? Profiling could well tell you that, as well as what portion of it to concentration on. It may also indicate the need to find a completely different algorithm/approach. The main point being it will provide you with important information about what parts of the code are consuming the most of the execution time. — martineau, Feb 11 '17 at 02:07
@martineau I don't assume a thing, I look at the complexity and see that it's exponential. — n. m. could be an AI, Feb 11 '17 at 02:18

trincot · Accepted Answer · 2017-02-11T01:28:17.400

I would suggest following improvements:

Do not pop a visited cell only because you failed on one step. This may lead to starvation: popping and trying again, popping ... etc. You should instead pick another cell you visited in the past, and continue from there. If that fails, pick another one...
Use a set to keep track of the visited cells: this will allow faster lookup
Use another "frontier" set for keeping track of which visited cells still have unvisited neighbours
When a cell is visited, check all neighbours in random order whether they are unvisited. The first one that is will be your next visit. But if all of them were visited (or outside the grid) then choose a random cell from the frontier set.

Here is the suggested code:

start = (random.randint(0, self.width-1), random.randint(0, self.height-1))
self.dungeon['up_stairs'] = start

current = start
frontier = set()
visited = set([start])
while len(visited) < 400:
    frontier.add(current)
    self.place_cell(current, is_wall=False)
    found = False
    while not found:
        choices = [(1, 0), (0, 1), (0, -1), (-1, 0)]
        random.shuffle(choices)
        for apos in choices:
            new = utils.tuple_add(current, apos)
            if not new in visited and self.on_map(new):
                found = True
                break
        if not found:
            # we can remove this cell from frontier as it is 
            # completely surrounded by visited cells
            frontier.discard(current)
            # pick a random other one to start from
            current = random.sample(frontier, 1)[0]
    current = new    
    visited.add(current)

This works like a charm. I had started implementing an algorithm very similar to this based on some comments on the original post, but this is a whole lot faster and more elegant. The use of sets is really cool. — Alexis Dumas, Feb 11 '17 at 01:56

"Confused Digger" algorithm running too slow, any way to speed it up?

1 Answers1