11

Question from Cracking the Coding Interview

In the image I do not understand where the extra O(s) comes from when the array of strings are being sorted. I get that sorting the array of string will be O(a log a), I can't understand why we have to add in O(s) as well.

In my mind, O(a log a) has taken care of sorting off all the strings in the array of strings.

MrBuggy
  • 423
  • 4
  • 12
  • Think about how log it takes to compare two length `s` strings. That's an `O(s)` operation. There are `O(a log a)` _comparisons_ but each comparison takes `O(s)` time. – mrmcgreg Nov 19 '16 at 22:46
  • There's an error in this image -- sorting a string of length s takes O(s) time, not O(s log s) as the text suggests (making the reasonable assumption that a string is a sequence of elements drawn from a finite alphabet, and that you choose an appropriate sorting algorithm like quicksort rather than a guaranteed Theta(s log s) algorithm like mergesort. – Paul Hankin Nov 19 '16 at 23:01
  • It's [arguable](http://www.bbc.co.uk/programmes/b07zyl4b) whether sorting a string is O(n log n) or just O(n) if indeed the string of characters is known in advance to be sorted. Regardless, the question asked was about "why add", and not what is the big O of sorting a string. In either cases, a "+" is needed to state the final running cost of the posted problem. – mohsenmadi Nov 19 '16 at 23:49
  • More on why sorting a string is indeed O(n log n), see the Accepted Answer of [this link](http://stackoverflow.com/questions/34318489/is-it-true-that-sorting-strings-is-on2logn/34318565) which explains the logic behind it. – mohsenmadi Nov 20 '16 at 00:07
  • 1
    @mohsenmadi I don't see the explanation of why sorting a single string is O(n log n) in either of your two links (and one of them looks like a copy-paste error -- it's to an unrelated bbc radio news story). You say the fact is arguable, but a https://en.wikipedia.org/wiki/Counting_sort sorts a string in O(n) time (as does quicksort). I agree it's not the original question, but I did think the error in the original document was worth noting in a comment. – Paul Hankin Nov 20 '16 at 02:40
  • Sorry about that! Here is what it should have been. It's [arguable](http://stackoverflow.com/questions/4433915/why-is-sorting-a-string-on-log-n) whether sorting a string is O(n log n) or just O(n) if indeed the string of characters is known in advance to be sorted. Regardless, the question asked was about "why add", and not what is the big O of sorting a string. In either cases, a "+" is needed to state the final running cost of the posted problem. – mohsenmadi Nov 20 '16 at 03:02
  • Your reference talks about "integers", integers are different that "strings" of characters. My second comment (the one starting with "More on why...") addresses exactly this point. I have taken it into consideration from the beginning. – mohsenmadi Nov 20 '16 at 03:11

4 Answers4

8

Got stuck on the same example! Remember that optimally it takes nlogn time to sort an array of n characters. For the final sort if we assume that each string in the array is of length 1 then we're again just sorting a characters so we get the aloga term, however the worst case length of each string is s so you need to do aloga comparisons s times.

sabujp
  • 989
  • 1
  • 11
  • 12
6

In the image you ask "why add?" Well, they are independent operations, one that sorts each of a strings the length of each is s, and that's O(a * s log s), and one that sorts the array of a strings, the length of each is s to count potential comparisons between each two strings, that's another O(a * s log a). Independent operations means "add". Adding gives O(a * s log s + a * s log a), which is what you got when you extract out the common factors.

mohsenmadi
  • 2,277
  • 1
  • 23
  • 34
  • 2
    Upvoted, but your answer contains a small error. "s" is given as the length of the longest string and not the length of each string. Of course, that doesn't change the final result since big-O is an upper bound, so assuming each string is as long as the longest string is fine. – Paul Hankin Nov 20 '16 at 02:46
  • Thanks. You said it, `s` here is an "upper bound", so assuming that each string will be as lengthy as the longest is the norm in big O calculations. There are others (e.g., omega, theta) for lower bound estimations. – mohsenmadi Nov 20 '16 at 03:07
3

Understand in such a way that when you're sorting an array of characters/numbers, you can sort that array based on simple comparisons of two index elements and complexity will be O(N*log(N)) where N is the length of array.

What happens when you start sorting an array of string?

array[0] = "rahul" and array[1]= "raj"

If you have to sort the above two indexes lexicographically (you can't simply compare two strings like numbers), you need to compare character wise. So it will run Math.max(array[0].length(), array[1].length()) times. From here that extra s is coming in O(s*a log(a))

Rahul Raj
  • 55
  • 2
  • 7
-1

enter image description hereMost clear and to the point explanation when to use "+(Add) and When to use *(Multiply)

Sheetal Shinde
  • 489
  • 5
  • 7