Consider the following two implementations of dijkstra's algorithm:
Using set:
// V - total number of vertices
// S - source node
void dijkstra(int V, vector<vector<int>> edge[], int S)
{
vector<int> dist(V, 1e9);
dist[S] = 0;
set<pair<int, int>> s;
for (int i=0; i<V; i++)
s.insert({dist[i], i}); // log(V) time
while (!s.empty()) // exactly V times
{
auto top = *(s.begin()); // constant time
int dis = top.first;
int node = top.second;
s.erase(top); // amortised constant time
for (auto it: edge[node]) // For all the iterations of outer while loop this will sum up to E, where E is the total number of edges in the graph
{
int nb = it[0];
int edge_weight = it[1];
if (dist[nb] > dis + edge_weight)
{
s.erase({dist[nb], nb}); // log(V) time
dist[nb] = dis + edge_weight;
s.insert({dist[nb], nb}); // log(V) time
}
}
}
}
Using priority queue:
// V - total number of vertices
// S - source node
void dijkstra(int V, vector<vector<int>> edge[], int S)
{
vector<int> dist(V, 1e9);
dist[S] = 0;
priority_queue<pair<int, int> , vector<pair<int, int>>, greater<pair<int, int>>> pq;
pq.push({dist[S], S});
while (!pq.empty()) // Can be more than V times, let's call this heap_size times
{
int node = pq.top().second; // O(1) time
pq.pop(); // log(heap_size) time
for (int i=0; i<edge[node].size(); i++) // Can be approximated to (E/V), where E is the total number of edges in the graph
{
int nb = edge[node][i][0];
int edge_weight = edge[node][i][1];
if (dist[nb] > dist[node] + edge_weight) {
dist[nb] = dist[node] + edge_weight;
pq.push({dist[nb], nb}); // log(heap_size) time
}
}
}
return dist;
}
Finding the time complexity using set based approach is easy as the number of elements in set is exactly V
(number of vertices) and the inner for loop runs for every edge, so it's time complexity is O(V*log(V) + V + E*log(V))
which is equivalent to O(E*log(V))
(reason: What's the time complexity of Dijkstra's Algorithm)
But I have trouble in understanding the time complexity of the priority_queue approach. Here the same node can be present multiple times in the priority_queue with different distances. How do I calculate the upper bound on the number of nodes that are added to the heap?
I also want to decide which implementation to use based on the nature of the graph(sparse vs dense), are both these implementations equivalent for any type of graph?