30

I am calculating the time complexity of Kruskal's algorithm like this (please see the algorithm in the image attached):

T(n) = O(1) + O(V) + O(E log E) + O(V log V)
     = O(E log E) + O(V log V)
Since |E| >= |V| - 1, the O(V log V) term is dominated by O(E log E):

T(n) = O(E log E + E log E)
     = O(E log E)

The CLRS Algorithm:

[Image: MST-KRUSKAL(G, w) pseudocode from CLRS]

Is this correct, or am I doing something wrong? Please tell me.

Sonali
  • Please tell me the complexity of line 4 and lines 5-9 – Sonali Dec 08 '13 at 08:31
  • Lines 5-9 will not be V log V; the loop runs once per edge, so it will be E*(something). FIND-SET and UNION take log V time, so overall lines 5-9 take E log V time. The overall complexity is thus O(E log E + E log V). Since E is at least O(V) and at most O(V²), the term E log E is always greater than or equal to E log V; thus overall it's O(E log E). – SpawN Oct 05 '20 at 05:21
  • @SpawN sorry, I did not get this part: "Since E can be at least O(V) and at most O(V²)". I can understand E being at least V - 1, but why at most V²? And why are we considering that "E log E is always greater than or equal to E log V"? – MrSham Oct 07 '20 at 06:40
  • @MrA for v vertices maximum edges can be v*(v-1)/2 and that is O(V^2). And v vertices requires at least (v-1) edges(connected graph) hence O(V). – darkexodus Oct 26 '21 at 05:06
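To ground the line-by-line discussion in the comments, here is a minimal, illustrative Python sketch of the algorithm (this is my own code, not the CLRS pseudocode; the names and edge-tuple layout are assumptions). It marks where the O(E log E) sort and the lines-5-9 loop costs come from:

```python
def kruskal(num_vertices, edges):
    """edges: list of (weight, u, v); returns (mst_edges, total_weight)."""
    parent = list(range(num_vertices))
    rank = [0] * num_vertices

    def find(x):                           # FIND-SET, with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # shortcut every other link
            x = parent[x]
        return x

    def union(x, y):                       # UNION by rank
        rx, ry = find(x), find(y)
        if rx == ry:
            return False                   # edge would create a cycle
        if rank[rx] < rank[ry]:
            rx, ry = ry, rx
        parent[ry] = rx
        if rank[rx] == rank[ry]:
            rank[rx] += 1
        return True

    mst, total = [], 0
    for w, u, v in sorted(edges):          # the sort: O(E log E)
        if union(u, v):                    # the loop overall: O(E log V)
            mst.append((u, v, w))
            total += w
    return mst, total
```

With union by rank, each FIND-SET/UNION in the loop costs O(log V), giving the O(E log E + E log V) total discussed above.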

6 Answers

22

Kruskal is O(E log E); your derivation is right. You could also say O(E log V) because E <= V * V, so log(E) <= 2 log(V) (I don't know why I remember that, other than that I think a prof put that on an exam at one point...)

Bandrami
  • But why is O(E log V) preferable to O(E log E)? Is it because graph complexities usually use both E and V in the formula? – Giacomo Pigani Jun 12 '18 at 20:38
  • E log E can be bounded here by E log(V²) [E = V² in the worst case: a complete graph], so E log E <= E log(V²) = 2E log V, i.e. O(E log V). Generally in graphs we use both E and V because we cannot simply replace E by V, as that would change the time complexity; but here, due to the presence of the log, we can. PS: we can't write O(E*E) as O(V*E) just to have both variables. – nutNCracker Nov 29 '18 at 15:37
  • You can even use counting sort to reduce the sorting time. This won't make an overall difference but might be useful during implementation. – Inderpartap Cheema Apr 11 '20 at 23:53
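To illustrate the counting-sort remark above: if the weights are small non-negative integers bounded by some known maximum, the edge sort drops to linear time, O(E + max_w). A hedged sketch (the function name, `max_w` parameter, and (weight, u, v) tuple layout are my assumptions):

```python
def counting_sort_edges(edges, max_w):
    """Sort edges (weight, u, v) with integer weights in 0..max_w,
    in O(len(edges) + max_w) time instead of O(E log E)."""
    buckets = [[] for _ in range(max_w + 1)]  # one bucket per weight value
    for edge in edges:
        buckets[edge[0]].append(edge)
    # concatenate buckets in increasing weight order (stable)
    return [edge for bucket in buckets for edge in bucket]
```

As the comment notes, this does not change the overall asymptotic bound, because the union-find loop still costs O(E log V).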
6

Since |V| > |E|+1, we prefer a tight upper bound with V terms instead of E terms.

    |E| <= |V|²
    log |E| <= log |V|²
    log |E| <= 2 log |V|
    running time of MST-KRUSKAL is: O(E log V)
Karim-53
MOHIT KUMAR
  • The graph is assumed to be connected, so shouldn't it be |V| <= |E| + 1, since at minimum, you have a tree which has property |V| - 1 = |E| => |V| = |E| + 1. By adding more edges, you arrive at a graph with cycles, so |V| <= |E| + 1. – M.M Apr 11 '17 at 17:05
  • Also, for those that may not immediately understand |E| <= |V|², max number of edges occurs in a complete graph where every vertex has an edge to all other vertices. A complete graph has (|V| choose 2) edges which is less than |V|². – M.M Apr 11 '17 at 17:08
2

Sorry for the late reply.
The runtime of Kruskal's algorithm is O(E log E), not O(E log V).

The edges have to be sorted first, which takes O(E log E); this dominates the cost of verifying whether each edge under consideration is a safe edge, which takes O(E log V). And |E| >= |V| - 1 (with equality in the corner case where the graph is already a tree), so it's safe to state the runtime as O(E log E).

letsBeePolite
  • Right, but E cannot exceed V * V, which means log(E) < 2 log(V). – Bandrami Apr 10 '15 at 06:36
  • Yes, I found the same statement in CLRS, and I agree with that too, but I feel O(E log E) is the tighter upper bound. Bottom line: both O(E log E) and O(E log V) work. – letsBeePolite Apr 10 '15 at 19:34
  • Any reference for this? – FaCoffee May 17 '16 at 13:16
  • Reference: https://people.cs.umass.edu/~barring/cs611/lecture/7.pdf – I haven't verified it completely, just skimmed and was convinced, so please take it with a pinch of salt. – letsBeePolite May 17 '16 at 17:56
  • @Bandrami don't you think the opposite also holds? E is between O(V) and O(V*V), so if you compare log E with log V, E is at least O(V); won't that make log E >= log V? That is why I feel O(E log E + E log V) = O(E log E). Yet I cannot disagree with E <= V*V giving log E <= 2 log V, which points the other way. Both arguments sound correct, yet the results differ. – SpawN Oct 05 '20 at 05:45
2

O(E log E) is definitely O(E log V), because E <= V² (a fully connected graph):

E log E <= E log(V²) = 2E log V = O(E log V)

1

All the other answers are correct, but we can consider the following case, which gives a time complexity of O(|E|).
This answer draws on the Algorithms book by Dasgupta, chapter 5, page 140, the section on path compression.
In the time-complexity computation of this algorithm, the dominant part is the edge-sorting step, which is O(|E| log |E|) or, as the other answers explained, O(|E| log |V|).

But what if the given edges are already sorted?
Or what if the weights are small (say, O(|E|)) so that sorting can be done in linear time (e.g., by applying counting sort)?
In such a case, the data-structure part (the union-find) becomes the bottleneck, and it is useful to think about improving its performance beyond log n per operation. The solution is to use the path-compression method during the find() operation.

This amortized cost turns out to be just barely more than O(1), down from the earlier O(log n); for more details please check the reference. The brief idea: whenever find(v) is called to find the root of the set that v belongs to, every node on the path from v to the root has its parent link changed to point directly at the root. This way, calling find(x) on any node x on that same path returns the set's root (label) in O(1). Hence, in this situation the union-find operations are the bottleneck, and with the described solution they become effectively O(1), so the running time of the algorithm is O(|E|).
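The path-compression idea described above can be sketched as follows (a recursive find(); the array-based parent representation is an assumption, not code from the book):

```python
def find(parent, x):
    """Return the root of x's set, compressing the path as a side effect:
    every node visited is re-linked to point directly at the root."""
    if parent[x] != x:
        parent[x] = find(parent, parent[x])  # recurse to root, then re-link
    return parent[x]
```

After one find() on a deep node, every node on that path points straight at the root, so subsequent finds on those nodes are O(1).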

Medi
0

For lines 5 to 9 the complexity is O(E):

  1. Line 5: O(E)
  2. Line 6: O(1)
  3. Line 7: O(1)
  4. Line 8: O(1)
  5. Line 9: O(1)

Up to line 5 you have calculated the complexity correctly. Finally, the dominating factor here is O(E lg E), so the complexity is O(E lg E).