0

What is the cost of the following method. How do you calculate it?

public String joinWords(String[] words) {
    String sentence = "";
    for (String w : words) {
        sentence = sentence + word;
    }
    return sentence;
}
Holger
  • 285,553
  • 42
  • 434
  • 765
redandblue
  • 750
  • 1
  • 5
  • 13
  • Optimal string concat would be `O(1)`. So you can assume inside the for loop it's `O(1)`, making the whole thing `O(n)` where `n={size of words}` – mash Jun 08 '14 at 22:17
  • 1
    It is either amortized O(n), or O(n²), depending on details of the `String` implementation. Since you haven't even told us which programming language this is, never mind which String you are using, we cannot be more specific. – zwol Jun 08 '14 at 22:18
  • @Mash The best case for concatenating two strings is O(n), not O(1), because you must at a minimum read each character of both strings and write them all to new memory locations. – zwol Jun 08 '14 at 22:19
  • @Zack: Not necessarily; one could imagine an implementation where the target string is just a list of pointers to individual components. (However, this is clearly Java code, so that's not the case!) – Oliver Charlesworth Jun 08 '14 at 22:22
  • @OliCharlesworth I've seen libraries like that, but they usually hesitate to call themselves _string_ libraries (the preferred term seems to be "rope"), and come with dire warnings about pathological corner-case behavior... – zwol Jun 08 '14 at 22:40
  • Possible duplicate of [How to find time complexity of an algorithm](https://stackoverflow.com/questions/11032015/how-to-find-time-complexity-of-an-algorithm) –  Oct 09 '19 at 18:56

1 Answers1

5

Assuming that the cost of string concatenation is O(r + s) for two strings of lengths r and s - which historically was the case in Java but may change going forward - the runtime would be O(mn), where n is the total number of characters in the input strings and m is the number of input strings.

To see this, note that if the string lengths are n1, n2, ..., n_m, then the runtime will be

n1 + (n1 + n2) + (n1 + n2 + n3) + ... + (n1 + n2 + ... + n_m)

= m(n_1) + (m - 1)(n_2) + (m - 2)(n-3) + ... + n_m

= m(n_1 + n_2 + ... + n_m) - n_2 - 2n_3 - 3n_4 - ... - (m - 1)n_m

Subject to the constraint that n_1 + ... + n_m = n, this is maximized when n_1 = n and all the other values are 0. In that case, the runtime becomes mn, so the runtime is O(mn).

Community
  • 1
  • 1
templatetypedef
  • 362,284
  • 104
  • 897
  • 1,065
  • 1
    Thanks for pointing that out! I believe that this answer predates the feature change that you linked here - I didn’t realize that the Java folks were working on revising this! – templatetypedef Oct 09 '19 at 18:57
  • 1
    There seems to be a huge misunderstanding. The linked change improved the performance and reduced code size, but doesn’t change the overall time complexity. As long as a string is defined as an immutable random access sequence of characters, there is no way around the O(r + s) time complexity, as that’s not specific to the concatenation but the general cost of constructing a new string of length *n* (where in case of concatenation, n = r + s) due to the defensive copying into a new array. Though you could easily avoid this by using `return String.join("", words);` here. Which would be O(n). – Holger Jun 01 '23 at 16:30