-2

I'm exploring the time complexities of string concatenation. I've read on many websites that the time complexity of concatenating two strings using the + operator is O(n^2) (or O(mn) to be exact in regards to concatenating two different string variables). I'm wondering if this only occurs for a certain type of concatenation scheme (using the + operator). I'm contemplating whether:

void append1 (String s){
  
  //s literal is " world"

  String str = "hello";

  for (int i = 0; i < s.length; i++){
    str += s.charAt(i);
  }
}

has the same time complexity as:

void append2 (String s){
  
  //s literal is " world"

  String str = "hello";

  str += s;
}

Would this be true or false?

In append1(String s) I understand that a new concatenation occurs every new iteration, and thus I can see how the time complexity for appending to String str in append1(String s) is O(n^2) i.e. O(mn) in our case (being the old copies are always copied over to a new array that can fit the new additional to-be-added character). However, in append2(String s), is it still the case that a new array, with copying all previous characters into it, is undergone for each character in the RHS literal (which would entail that " world" is not just one concatenation but many for each charAt(x) in " world"?)? For:

str += " world"

Is it not the case that the compiler can detect the length of the RHS literal and directly assign the new array to be whatever the length of str is + the RHS literal right away, or is it unable to do so and just creates the new arrays whenever a new character is to be appended? If this is the case, doesn't this entail that creating any type of String is O(n^2) (if not, how can the compiler infer the length of RHS literal for String x = "hello" (which presumably is O(n)) but not x += " world")? Conclusively, should:

 str += " world"

be regarded as one or many concatenations? Thanks.

Zerenity
  • 107
  • 6
  • O(n^2)??? Please share any resource claiming that. And what is n in your case? See you just asking about appending a string of length n to itself? – luk2302 May 23 '23 at 18:38
  • "Is it not the case that the compiler can detect the length of the RHS literal and directly assign the new array to be whatever the length of str is + the RHS literal right away" Appending strings requires copying characters, not just calculating the length. – Code-Apprentice May 23 '23 at 18:44
  • 1
    @luk2302 https://interviews.school/strings https://stackoverflow.com/questions/15400508/string-concatenation-complexity-in-c-and-java https://plainenglish.io/blog/concatenating-strings-efficiently-in-python etc. Sorry, I should change it to O(mn) considering two diff input variables. – Zerenity May 23 '23 at 18:46
  • All those are basically intentionally bad, O(n^2) yes. And no, your second snippet obviously is O(n) or O(n+m) but not quadratic. – luk2302 May 23 '23 at 18:48
  • 2
    Stop trying to reason about what the compiler can infer or not, that is irrelevant. Adding anything to a string is linear in time, adding things n times therefore is quadratic, end of argument. – luk2302 May 23 '23 at 18:50
  • @Code-Apprentice I know, I'm wondering though whether a new array is created for each step in the RHS literal with copying of prev characters occurring for each new array or if one single array is created by inferring the length of RHS literal and copying of prev characters only occurring once, i.e., is it case 1 that happens (in java specifically) or case 2 for string concatenation? – Zerenity May 23 '23 at 18:54
  • 1
    @luk2302 Right, that's what I assumed too, so concatenating two strings in one line outside a loop is O(n+m) (as a new array is only formed once being that java can infer the length of the RHS literal?) while concatenating within a loop in a per iteration basis is O(nm) (as new arrays are formed every iteration?)? Edit: Thanks, your latest comment makes sense. – Zerenity May 23 '23 at 18:58
  • I think some compilers or JIT interpeters automatically convert repeated string concatenation to use `StringBuilder` in some cases, which amortizes reallocation? – Solomon Ucko May 23 '23 at 19:02
  • 1
    @SolomonUcko I read something about the StringBuilder automatically being invoked in the following forum discussion https://stackoverflow.com/questions/58309852/why-is-the-complexity-of-simple-string-concatenation-on2 – Zerenity May 23 '23 at 19:06
  • 2
    This (in append2) looks like a bug: `String str += s;` You're concatenating 's' onto an uninitialized variable. It's probably not even legal syntax; the context is 'initialization', not 'assignment'. – Arfur Narf May 23 '23 at 22:22

1 Answers1

3

Concatenating strings should be O(n). It will be roughly the following algorithm:

get the length of each string and add them together
allocate a new array of this length
copy each character from the first string into the new array buffer
copy each character from the second string int the new array buffer making sure to place them after the last character from the first string

The first 2 steps are "simple" operations and take constant time because they don't depend on the number of characters. This assumes that the length of each string is already stored somewhere and we don't have to count the characters on-the-fly.

The last two steps will require a single loop, so will be O(n).

Code-Apprentice
  • 81,660
  • 23
  • 145
  • 268
  • Thanks, clear now! I do think though that you meant "The first 2 steps are..." right? – Zerenity May 23 '23 at 19:17
  • 1
    @Zerenity I can't count. – Code-Apprentice May 23 '23 at 21:19
  • 2
    Note that allocating a new array can be O(n) due to the mandated zero filling, but in case of a `String` construction we can expect the JVM to do the magic of eliminating this step, as the array is definitely filled with content. But even without the optimization, the overall time complexity wouldn’t change. And it’s worth mentioning explicitly that the question’s assumption is correct, the `variant1` has O(n²) time complexity due to performing the concatenation in a loop. – Holger Jun 01 '23 at 16:11