30

I am calculating the time complexity of Kruskal's algorithm like this (please see the algorithm in the image attached):

T(n) = O(1) + O(V) + O(E log E) + O(V log V)
     = O(E log E) + O(V log V)
Since |E| >= |V| - 1, the O(V log V) term is dominated by O(E log E):

T(n) = O(E log E + E log E)
     = O(E log E)

The CLRS Algorithm:

[Image: MST-KRUSKAL(G, w) pseudocode from CLRS]

Is this correct, or am I doing something wrong? Please tell me.

Sonali
  • Please tell me the complexity of line 4 and lines 5-9 – Sonali Dec 08 '13 at 08:31
  • Lines 5-9 will not be V log V; the loop runs once per edge, so it will be E*(something). FIND-SET and UNION take log V time, so overall lines 5-9 take E log V time. The overall complexity is thus O(E log E + E log V). Since E is at least O(V) and at most O(V²), the term E log E is always greater than or equal to E log V; thus overall it's O(E log E). – SpawN Oct 05 '20 at 05:21
  • @SpawN sorry, I did not get this part: "Since E can be at least O(V) and at most O(V²)". I can understand E being at least V - 1, but why at most V²? And why are we considering that "E log E is always greater than or equal to E log V"? – MrSham Oct 07 '20 at 06:40
  • @MrA for v vertices maximum edges can be v*(v-1)/2 and that is O(V^2). And v vertices requires at least (v-1) edges(connected graph) hence O(V). – darkexodus Oct 26 '21 at 05:06
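To ground the line-by-line discussion in the comments, here is a minimal, illustrative Python sketch of the algorithm (this is my own code, not the CLRS pseudocode; the names and edge-tuple layout are assumptions). It marks where the O(E log E) sort and the lines-5-9 loop costs come from:

```python
def kruskal(num_vertices, edges):
    """edges: list of (weight, u, v); returns (mst_edges, total_weight)."""
    parent = list(range(num_vertices))
    rank = [0] * num_vertices

    def find(x):                           # FIND-SET, with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # shortcut every other link
            x = parent[x]
        return x

    def union(x, y):                       # UNION by rank
        rx, ry = find(x), find(y)
        if rx == ry:
            return False                   # edge would create a cycle
        if rank[rx] < rank[ry]:
            rx, ry = ry, rx
        parent[ry] = rx
        if rank[rx] == rank[ry]:
            rank[rx] += 1
        return True

    mst, total = [], 0
    for w, u, v in sorted(edges):          # the sort: O(E log E)
        if union(u, v):                    # the loop overall: O(E log V)
            mst.append((u, v, w))
            total += w
    return mst, total
```

With union by rank, each FIND-SET/UNION in the loop costs O(log V), giving the O(E log E + E log V) total discussed above.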

6 Answers

22

Kruskal is O(E log E); your derivation is right. You could also say O(E log V) because E <= V * V, so log(E) <= 2 log(V) (I don't know why I remember that, other than that I think a prof put that on an exam at one point...)

Bandrami
  • But why is O(E log V) preferable to O(E log E)? Is it because graph complexities usually use both E and V in the formula? – Giacomo Pigani Jun 12 '18 at 20:38
  • E log E can be bounded here by E log(V²) [E = V² in the worst case: a complete graph], so E log E <= E log(V²) = 2E log V, i.e. O(E log V). Generally in graphs we use both E and V because we cannot simply replace E by V, as that would change the time complexity; but here, due to the presence of the log, we can. PS: we can't write O(E*E) as O(V*E) just to have both variables. – nutNCracker Nov 29 '18 at 15:37
  • You can even use counting sort to reduce the sorting time. This won't make an overall difference but might be useful during implementation. – Inderpartap Cheema Apr 11 '20 at 23:53
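To illustrate the counting-sort remark above: if the weights are small non-negative integers bounded by some known maximum, the edge sort drops to linear time, O(E + max_w). A hedged sketch (the function name, `max_w` parameter, and (weight, u, v) tuple layout are my assumptions):

```python
def counting_sort_edges(edges, max_w):
    """Sort edges (weight, u, v) with integer weights in 0..max_w,
    in O(len(edges) + max_w) time instead of O(E log E)."""
    buckets = [[] for _ in range(max_w + 1)]  # one bucket per weight value
    for edge in edges:
        buckets[edge[0]].append(edge)
    # concatenate buckets in increasing weight order (stable)
    return [edge for bucket in buckets for edge in bucket]
```

As the comment notes, this does not change the overall asymptotic bound, because the union-find loop still costs O(E log V).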
6

Since |V| > |E|+1, we prefer a tight upper bound with V terms instead of E terms.

    |E| <= |V|²
    log |E| <= log |V|²
    log |E| <= 2 log |V|
    running time of MST-KRUSKAL is: O(E log V)
Karim-53
MOHIT KUMAR
  • The graph is assumed to be connected, so shouldn't it be |V| <= |E| + 1, since at minimum, you have a tree which has property |V| - 1 = |E| => |V| = |E| + 1. By adding more edges, you arrive at a graph with cycles, so |V| <= |E| + 1. – M.M Apr 11 '17 at 17:05
  • Also, for those that may not immediately understand |E| <= |V|², max number of edges occurs in a complete graph where every vertex has an edge to all other vertices. A complete graph has (|V| choose 2) edges which is less than |V|². – M.M Apr 11 '17 at 17:08
2

Sorry for the late reply.
The runtime of Kruskal's algorithm is O(E log E), not O(E log V).

The edges have to be sorted first, which takes O(E log E); this dominates the cost of verifying whether each edge under consideration is a safe edge, which takes O(E log V). And |E| >= |V| - 1 (with equality in the corner case where the graph is already a tree), so it's safe to state the runtime as O(E log E).

letsBeePolite
  • Right, but E cannot exceed V * V, which means log(E) < 2 log(V). – Bandrami Apr 10 '15 at 06:36
  • Yes, I found the same statement in CLRS, and I agree with that too, but I feel O(E log E) is the tighter upper bound. Bottom line: both O(E log E) and O(E log V) work. – letsBeePolite Apr 10 '15 at 19:34
  • Any reference for this? – FaCoffee May 17 '16 at 13:16
  • Reference: https://people.cs.umass.edu/~barring/cs611/lecture/7.pdf – I haven't verified it completely, just skimmed and was convinced, so please take it with a pinch of salt. – letsBeePolite May 17 '16 at 17:56
  • @Bandrami don't you think the opposite also holds? E is between O(V) and O(V*V), so if you compare log E with log V, E is at least O(V); won't that make log E >= log V? That is why I feel O(E log E + E log V) = O(E log E). Yet I cannot disagree with E <= V*V giving log E <= 2 log V, which points the other way. Both arguments sound correct, yet the results differ. – SpawN Oct 05 '20 at 05:45
2

O(E log E) is definitely O(E log V), because E <= V² (a fully connected graph):

E log E <= E log(V²) = 2E log V = O(E log V)

1

All the other answers are correct, but we can consider the following case, which gives a time complexity of O(|E|).
This answer draws on the Algorithms book by Dasgupta, chapter 5, page 140, the section on path compression.
In the time-complexity computation of this algorithm, the dominant part is the edge-sorting step, which is O(|E| log |E|) or, as the other answers explained, O(|E| log |V|).

But what if the given edges are already sorted?
Or what if the weights are small (say, O(|E|)) so that sorting can be done in linear time (e.g., by applying counting sort)?
In such a case, the data-structure part (the union-find) becomes the bottleneck, and it is useful to think about improving its performance beyond log n per operation. The solution is to use the path-compression method during the find() operation.

This amortized cost turns out to be just barely more than O(1), down from the earlier O(log n); for more details please check the reference. The brief idea: whenever find(v) is called to find the root of the set that v belongs to, every node on the path from v to the root has its parent link changed to point directly at the root. This way, calling find(x) on any node x on that same path returns the set's root (label) in O(1). Hence, in this situation the union-find operations are the bottleneck, and with the described solution they become effectively O(1), so the running time of the algorithm is O(|E|).
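The path-compression idea described above can be sketched as follows (a recursive find(); the array-based parent representation is an assumption, not code from the book):

```python
def find(parent, x):
    """Return the root of x's set, compressing the path as a side effect:
    every node visited is re-linked to point directly at the root."""
    if parent[x] != x:
        parent[x] = find(parent, parent[x])  # recurse to root, then re-link
    return parent[x]
```

After one find() on a deep node, every node on that path points straight at the root, so subsequent finds on those nodes are O(1).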

Medi
0

For lines 5 to 9 the complexity is O(E):

  1. Line 5: O(E)
  2. Line 6: O(1)
  3. Line 7: O(1)
  4. Line 8: O(1)
  5. Line 9: O(1)

Up to line 5 you have calculated the complexity correctly. Finally, the dominating factor here is O(E lg E), so the complexity is O(E lg E).