When can we have duplicate nodes in the heap in dijikstras algorithm?

Question

So this question came to my mind when solving some dijikstra's based leetcode problems. We have node, distance pairs in priority queue at each step.

Having duplicate nodes in the heap depends on one thing i.e when we mark the node as visited(i.e confirming that we have found the shortest length for it). If we mark it while pushing into the queue we will not have any duplicates, if we mark after popping from the queue, we may have duplicates in the queue.

https://leetcode.com/problems/network-delay-time/ In this question we can mark the node as visited only after popping from the priority Queue, or we will miss some edge that may lead to a shorter path. Ex: [[1,2,1],[2,3,2],[1,3,4]] 3 1

If we add while inserting we will get wrong answer while exploring 1's neighbors what we are doing is , 1->2 queue={2,1} visited={1,2} 1->3 queue{(2,1), (3,4)} since all nodes are now visited, we will never encounter the path 1->2->3 distance=1+2=3.

But in other questions we can do a dijikstra with visited marked before the insertion into the priority queue, ex: https://leetcode.com/problems/swim-in-rising-water/

why is dijikstra with visited marked before the insertion valid here

It's [Dijkstra](https://en.wikipedia.org/wiki/Edsger_W._Dijkstra) by the way — harold, May 10 '22 at 19:58
I have a feeling that the rising water problem is solved with [a BFS, not Dijkstra's](https://stackoverflow.com/questions/25449781/what-is-difference-between-bfs-and-dijkstra). The two are very similar. In particular, simple BFS is faster if all edges have the same weight. — user3386109, May 10 '22 at 20:18
@user3386109 you are right it's just bfs with a priority Queue. — BosssMan861, May 11 '22 at 08:42

tanmoy · Answer 1 · 2022-05-17T06:40:55.800

BFS is for blindly visiting nodes (may assume all weight 1), Dijkstra will prioritize with the least weighted path.

When can we have duplicate nodes in the heap in Dijkstra algorithm?

      a
   4/   \2
   /     \
   b ---- c   
   |  1 
 4 |
   d

1. here start Dijkstra from a.                   queue = (a, 0)

2. b, c pushed with path cost into p_queue.      queue = (b, 4), (c, 2)

3. c popped. b with another cost pushed.         queue = (b, 4), (b, 3)  here the new (b, 3) has (ac + cb) cost

4. least b popped. d pushed                      queue = (b, 4), (d, 7)

5. As we check and mark after pop. b popped.     queue = (d, 7)

6. But already visited so returned

7. Process d

But in other questions we can do a Dijkstra with visited marked before the insertion into the priority queue, ex: https://leetcode.com/problems/swim-in-rising-water why is Dijkstra with visited marked before the insertion valid here?

Depends largely on the problem itself. As for this particular problem, we get weight directly from the node, and no matter whether we push it before or after, it will be popped at the same time, though keeping visited checking before push will prevent redundant pushing.

Here is my accepted implementation where you can comment out or keep either after pop or before push marking & checking for visited nodes and both will get accepted.

class Solution {
    struct PosWeight {
        pair<int, int> pos;
        int weight;
        PosWeight(pair<int, int> a, int weight): pos(a), weight(weight){}
    };
    
    int visited[51][51] = {0};
    int traverse(vector<vector<int>>& grid) {
        int row = size(grid);
        int column = size(grid[0]);
        vector<pair<int, int>> directions = { {0,1}, {1,0}, {-1,0}, {0,-1} };

        auto isValidTo = [&grid, row, column]
            (pair<int, int> direction, pair<int, int> pos) {
      
            if (pos.first + direction.first >= row) return false;
            if (pos.first + direction.first < 0) return false;
            
            if (pos.second + direction.second >= column) return false;
            if (pos.second + direction.second < 0) return false;
            return true;
        };
            
        auto comp = [](PosWeight &a, PosWeight &b) {
            return a.weight > b.weight;
        };
        
        int maxx =INT_MIN;
        
        priority_queue<PosWeight, vector<PosWeight>, decltype(comp)> pq(comp);
        pq.emplace(make_pair(0,0), grid[0][0]);
        
        while(!pq.empty()) {
            auto elem = pq.top();
            pq.pop();
            
            // You can comment out this portion and keep the later one before pushing in queue
            if (visited[elem.pos.first][elem.pos.second]) continue;
            visited[elem.pos.first][elem.pos.second] = 1;
            // ---
            
            maxx = max(maxx, elem.weight);
            if (elem.pos.first == row - 1 && elem.pos.second == column - 1) 
                return maxx;
            
            for(auto direction: directions)
                if (isValidTo(direction, elem.pos)) {
                    pair<int,int> toGo = make_pair( direction.first + elem.pos.first, 
                                  direction.second + elem.pos.second );
                    auto weight = grid[toGo.first][toGo.second];
                    
                    // You can comment out this portion and keep the later one before pushing in queue
                    // if (visited[toGo.first][toGo.second]) continue;
                    // visited[toGo.first][toGo.second] = 1;
                    // -----
                    
                    pq.emplace(toGo, weight);
                }
        }
        
        return maxx;
    }
    
public:
    int swimInWater(vector<vector<int>>& grid) {
        return traverse(grid);
    }
};

But, for https://leetcode.com/problems/network-delay-time this problem, the checking must be after pop as doing so before push will cause early all node visits instead of prioritized shortest as you stated in your question.

class Solution {
public:
    int networkDelayTime(vector<vector<int>>& times, int n, int k) {
        
        auto comp = [](pair<int,int> &a, pair<int,int> &b) {
            return a.second > b.second;
        };
        
        priority_queue<pair<int,int>, vector<pair<int,int>>, decltype(comp)> que(comp);
        vector<vector<int>> rel(n, vector<int>(n, -1));
        
        for (auto &time: times)
            rel[time[0] - 1][time[1] - 1] = time[2];
        
        vector<int> visit(n, 0);
        que.push({k-1, 0});
        
        int sz = n;
        while (size(que)) {
            auto now = que.top();
            que.pop();
            
            if (visit[now.first]) continue;
            visit[now.first] = 1;
            if (!--sz) return now.second;
            
            auto id = now.first, val = now.second;
            for (int i = 0; i < n; ++i)
                if (rel[id][i] != -1) {
                    // cant do checking here
                    que.push({i, val + rel[id][i]});
                }
        }
        return -1;
    }
};

So, bottom line, it depends on the nature and requirement of the problem.

When can we have duplicate nodes in the heap in dijikstras algorithm?

1 Answers1