0

So I have a function that looks at a grid (2D array) and finds all the paths from the starting point to the end point. So far the algorithm works as intended and I get the values that I'm looking for.

The problem is that it takes forever. It can run over a 100 x 100 grid no problem, but once I get to a 10000 x 10000 grid, it'll take about 10 min to give back an answer, where I'm looking for maybe 1 min at most.

Here's what it looks like right now:

public void BFS(Point s, Point e){
        /**
         * North, South, East, West coordinates
         */
        int[] x = {0,0,1,-1};
        int[] y = {1,-1,0,0};

        LinkedList<Point> queue = new LinkedList<>();
        queue.add(s);

        /**
         * 2D int[][] grid that stores the distances of each point on the grid
         * from the start
         */
        int[][] dist = new int[numRow][numCol];
        for(int[] a : dist){
            Arrays.fill(a,-1);
        }
        /**
        * "obstacles" is an array of Points that contain the (x, y) coordinates of obstacles on the grid
        * designated as a -2, which the BFS algorithm will avoid.
        */
        for(Point ob : obstacles){
            dist[ob.x][ob.y] = -2;
        }
        // Start point
        dist[s.x][s.y] = 0;

        /**
         * Loops over dist[][] from starting point, changing each [x][y] coordinate to the int
         * value that is the distance from S.
         */
        while(!queue.isEmpty()){                                        
            Point p = queue.removeFirst();
            for(int i = 0; i < 4; i++){                             
                int a = p.x + x[i];
                int b = p.y + y[i];
                if(a >= 0 && b >= 0 && a < numRow && b < numCol && dist[a][b] == -1){
                    dist[a][b] = 1 + dist[p.x][p.y];
                    Point tempPoint = new Point(a, b);
                    if(!queue.contains(tempPoint)){
                        queue.add(tempPoint);
                    }
                }
            }
        }

        /**
         * Works backwards to find all shortest path points between S and E, and adds each
         * point to an array called "validPaths"
         */
        queue.add(e);
        while(!queue.isEmpty()){
            Point p = queue.removeFirst();

            // Checks grid space (above, below, left, and right) from Point p
            for(int i = 0; i < 4; i++){
                int curX = p.x + x[i];
                int curY = p.y + y[i];

                // Index Out of Bounds check
                if(curX >= 0 && curY >= 0 && !(curX == start.x && curY == start.y) && curX < numRow && curY < numCol){
                    if(dist[curX][curY] < dist[p.x][p.y] && dist[curX][curY] != -2){ // -2 is an obstacle
                        Point tempPoint = new Point(curX, curY);
                        if(!validPaths.contains(tempPoint)){
                            validPaths.add(tempPoint);
                        }
                        if(!queue.contains(tempPoint)){
                            queue.add(tempPoint);
                        }
                    }
                }
            }
        }

So again, while it works, it's really slow. I'm trying to get a O(n + m), but I believe that it might be running in O(n^2).

Does anyone know any good ideas to make this faster?

ESM
  • 175
  • 10

1 Answers1

2

An clear reason for the observed inefficiency are the comparisons !validPaths.contains(tempPoint) and !queue.contains(tempPoint) which are both O(n). To do these comparisons you should be striving for an O(1) comparison, which can be accomplished by using a special datastructure such as a hash-set or simply a bitset.

As it stands now, your implementation is clearly O(n^2) because of these comparisons.

ldog
  • 11,707
  • 10
  • 54
  • 70
  • Thanks for the tip. I changed it to a hashSet like you said, and got rid of the `!queue.contains(tempPoint)` in both places as it was redundant (the if statement already does a check). And I can run a 10000 x 10000 with a start point of (1, 2) and an end point of (4000, 3000) in about 5 min. Unfortunately I run out of heap space if I do something like (1, 2) and (9500, 9500), but I don't know how to further help that issue. – ESM Sep 30 '20 at 20:34