How many moves to reach a destination? Efficient flood filling

Question

I want to compute the distance of cells from a destination cell, using number of four-way movements to reach something. So the the four cells immediately adjacent to the destination have a distance of 1, and those on the four cardinal directions of each of them have a distance of 2 and so on. There is a maximum distance that might be around 16 or 20, and there are cells that are occupied by barriers; the distance can flow around them but not through them.

I want to store the output into a 2D array, and I want to be able to compute this 'distance map' for any destination on a bigger maze map very quickly.

I am successfully doing it with a variation on a flood fill where the I place incremental distance of the adjacent unfilled cells in a priority queue (using C++ STL).

I am happy with the functionality and now want to focus on optimizing the code, as it is very performance sensitive.

What cunning and fast approaches might there be?

An example map for all cells within four moves of the target x; black denotes barrier

You can use simple `queue` instead of `priority queue` because you iterate over cells in increasing order of distance from the start cell. — Ivan Bianko, Nov 14 '11 at 11:02

score 4 · Answer 1 · edited May 03 '23 at 12:24

I think you have done everything right. If you coded it correct it takes O(n) time and O(n) memory to compute flood fill, where n is the number of cells, and it can be proven that it's impossible to do better (in the general case). And after the fill is complete you just return the distance for any destination with O(1), it is easy to see that it also can be done better.

So if you want to optimize performance, you can only focus on code local optimization. This will not affect the asymptotic complexity but can significantly improve your real execution time. But it's hard to give you any advice for code optimization without actually seeing the source code.

So if you really want to see optimized code see the following (Pure C):

#include <stdlib.h>

int* BFS()
{
    int N, M; // Assume we have NxM grid.
    int X, Y; // Start position. X, Y are unit based.
    int i, j;
    int movex[4] = {0, 0, 1, -1}; // Move on x dimension.
    int movey[4] = {1, -1, 0, 0}; // Move on y dimension.
    
    // TO DO: Read N, M, X, Y
    
    // To reduce redundant functions calls and memory reallocation 
    // allocate all needed memory once and use a simple arrays.
    int* map = (int*)malloc((N + 2) * (M + 2) * sizeof(int)); 
    int leadDim = M + 2;
    // Our map. We use one dimension array. map[x][y] = map[leadDim * x + y];
    // If (x,y) is occupied then map[leadDim*x + y] = -1;
    // If (x,y) is not visited map[leadDim*x + y] = -2;

    int* queue = (int*)malloc(N * M * sizeof(int));
    int first = 0, last =1; 
    
    // Fill the boarders to simplify the code and reduce conditions
    for (i = 0; i < N+2; ++i)
    {
        map[i * leadDim + 0] = -1;
        map[i * leadDim + M + 1] = -1;
    }
    
    for (j = 0; j < M+2; ++j)
    {
        map[j] = -1;
        map[(N + 1) * leadDim + j] = -1;
    }
    
    // TO DO: Read the map.
    
    queue[first] = (X+1) * leadDim + Y + 1;
    map[(X+1) * leadDim + Y + 1] = 0;
    
    // Very simple optimized process loop.
    while (first < last) 
    {
        int current = queue[first];
        int step = map[current];
    
        for (i = 0; i < 4; ++i)
        {
            int temp = current + movex[i] * leadDim + movey[i];
            if (map[temp] == -2) // only one condition in internal loop.
            {
                map[temp] = step + 1;
                queue[last++] = temp;
            }
        }
    
        ++first;
    }
    
    free(queue);
    
    return map;
}

The code may seem tricky. And of course, it is not object oriented (OOP) but if you want something really fast that's what you need.

priority queue isn't O(1)? I was hoping for descriptions of marching or sweeping algorithms — Will, Nov 14 '11 at 10:52
There are many priority queues data structures, designed for different needs. The easiest one called Binary Heap, has O(log n) asymptotic for add, delete, priority increase, priority decrease operations. Some new approaches (such as Fibonacci and Relaxed Heap) decrease asymptotic of some of operations (under certain conditions) to O(1) but they are pretty complex and has significant constant. Simple FIFO queue you need there already do push/pop in O(1). And nature of this task guarantees the validity of solution based on FIFO. Still nothing to optimize using algorithm theory. — Wisdom's Wind, Nov 14 '11 at 11:07
a friend pointed me at http://stackoverflow.com/questions/7426136/fastest-available-algorithm-for-distance-transform — Will, Nov 14 '11 at 11:20
Interesting topic, but tasks they investigating is more complex than your. For your simple case you can use more simple (and effective) solution. — Wisdom's Wind, Nov 14 '11 at 12:33
malloc size should involve the size of an int instead of only the count of those integers. Also, the index for point (x, y) should be padded with the border, so the correct index should be (y + 1) * linewidth + x + 1. — RockingDice, Apr 29 '23 at 06:44

score 1 · Answer 2 · answered Nov 14 '11 at 11:23

It's common task for BFS. Complexity is O(cellsCount)

My c++ implementation:

vector<vector<int> > GetDistance(int x, int y, vector<vector<int> > cells)
{
    const int INF = 0x7FFFFF;
    vector<vector<int> > distance(cells.size());
    for(int i = 0; i < distance.size(); i++)
        distance[i].assign(cells[i].size(), INF);
    queue<pair<int, int> > q;

    q.push(make_pair(x, y));
    distance[x][y] = 0;

    while(!q.empty())
    {
        pair<int, int> curPoint = q.front();
        q.pop();
        int curDistance = distance[curPoint.first][curPoint.second];
        for(int i = -1; i <= 1; i++)
            for(int j = -1; j <= 1; j++)
            {
                if( (i + j) % 2 == 0 ) continue;
                pair<int, int> nextPoint(curPoint.first + i, curPoint.second + j);
                if(nextPoint.first >= 0 && nextPoint.first < cells.size()
                   && nextPoint.second >= 0 && nextPoint.second < cells[nextPoint.first].size()
                   && cells[nextPoint.first][nextPoint.second] != BARRIER
                   && distance[nextPoint.first][nextPoint.second] > curDistance + 1)
                   {
                       distance[nextPoint.first][nextPoint.second] = curDistance + 1;
                       q.push(nextPoint);
                   }                    
            }
    }
    return distance;
}

score 0 · Answer 3 · answered Oct 09 '16 at 17:29

8-bit computers in the 1970s did this with an optimization that has the same algorithmic complexity, but in the typical case is much faster on actual hardware.

Starting from the initial square, scan to the left and right until "walls" are found. Now you have a "span" that is one square tall and N squares wide. Mark the span as "filled," in this case each square with the distance to the initial square.

For each square above and below the current span, if it's not a "wall" or already filled, pick it as the new origin of a span.

Repeat until no new spans are found.

Since horizontal rows tend to be stored contiguously in memory, this algorithm tends to thrash the cache far less than one that has no bias for horizontal searches.

Also, since in the most common cases far fewer items are pushed and popped from a stack (spans instead of individual blocks) there is less time spent maintaining the stack.

score 0 · Answer 4 · answered Nov 14 '11 at 10:58

0

Start with a recursive implementation: (untested code)

 int visit( int xy, int dist) {
    int ret =1;
    if (array[xy] <= dist) return 0;
    array[xy] = dist;
    if (dist == maxdist) return ret;
    ret += visit ( RIGHT(xy) , dist+1);
    ... 
    same for left, up, down
    ...
    return ret;
    }

You'l need to handle the initalisation and the edge-cases. And you have to decide if you want a two dimentional array or a one dimensonal array.

A next step could be to use a todo list and remove the recursion, and a third step could be to add some bitmasking.

answered Nov 14 '11 at 10:58

wildplasser

43,142
8
66
109

1

BFS is better than DFS to that task – Ivan Bianko Nov 14 '11 at 11:09
This is a good example of bad solution. At first it recursive, while there is a clear and fast solution without recursion. Excessive recursive calls will slow down performance. Second, it suffer from stack overflow for big matrices. – Wisdom's Wind Nov 14 '11 at 11:11
Recursion isn't always slow. In addition, it is often much easier to prove properties about a program when it utilizes recursion. – I GIVE CRAP ANSWERS Nov 14 '11 at 11:16
IMHO the first step should always be simple. As I said: next step should be to use a todo list / queue / heap / whatever. – wildplasser Nov 14 '11 at 11:25
I'm all for simple, and as I say in the question I've got simple working. I'm really after the cunning clever marching algorithms. – Will Nov 14 '11 at 11:29
BTW: the locality of reference might even be better, visiting {right,left} before {up,down}. – wildplasser Nov 14 '11 at 11:30
1

"Recursion isn't always slow." Depends what you will use to compare. Good straight implementation of the same thing always better than recursion. But of course not always easier to understand and faster to code. And actually this code is not clear enough. If you really meant DFS, it's completely useless for this task. In fact it will take exponential time. – Wisdom's Wind Nov 14 '11 at 12:39

How many moves to reach a destination? Efficient flood filling

4 Answers4

Linked