The problem with saying: Sorting is O(n logn) is that this is an incomplete and therefore nebulous, undefined statement. It is unclear what it means. The problem is: You have failed to define what n
is.
Now, in practice, n
is usually obvious from context. For example, If I say: "Sorting an array of integers is O(n logn) assuming an optimal sorting algorithm", just about everybody who hears this will immediately realize that, whilst I didn't explicitly say what n
is, clearly I mean: n = the number of ints in that array.
So let's go back to your problem: Now n is no longer clear.
Is n the # of strings in the array? Is n the average # of characters for all strings in that array?
We're really talking about 2 separate variables here. So, to properly describe the performance characteristics in terms of big-O notation for this problem, you'd say something like:
- Given:
String[] str
, for example: str = {"Hello", "World", "John", "Doe"}
- Let
n
be the number of elements in the str
array.
- Let
m
be the average # of characters for the strings in the str
array
- Then, calling
Arrays.sort(str)
on this array would have the performance characteristic of O(m * n log n)
.
The reason it's O(m*n logn) is because the sorting algorithm itself will run O(n logn)
comparison operations in order to perform the sort. And each comparison operation takes O(m)
time to finish, though that is very much the worst case scenario: Given, say, Hello
and World
, even though m = 5
for those two, the comparison completes after only 1 step; the algorithm compares H
and W
and answers immediately, never even looking at ello
or orld
. But if the strings are Hello1
, Hello2
, Hello3
and Hello4
, then that's a guaranteed 5 'steps' for every time 2 strings are compared, and there will be O(n logn)
comparisons.