
I come from a C# background, where System.String is immutable and string concatenation is relatively expensive (as it requires reallocating the string). There we know to use the StringBuilder type instead, as it preallocates a larger buffer where single characters (Char, a 16-bit value type) and short strings can be appended cheaply without extra allocation.

I'm porting some C# code to Swift which reads from a bit-array ([Bool]) at sub-octet indexes with character lengths less than 8 bits (it's a very space-conscious file format).

My C# code does something like this:

 StringBuilder sb = new StringBuilder( expectedCharacterCount );
 int idxInBits = 0;
 Boolean[] bits = ...;
 for(int i = 0; i < someLength; i++) {
     Char c = ReadNextCharacter( ref idxInBits, 6 ); // each character is 6 bits in this example
     sb.Append( c );
 }

In Swift, I assume NSMutableString is the equivalent of .NET's StringBuilder, and I found this Q&A about appending individual characters (How to append a character to string in Swift?), so in Swift I have this:

let buffer = NSMutableString()
for _ in 0..<charCount {
    let charValue: Character = readNextCharacter( ... )
    buffer.appendFormat("%c", charValue)
}
return String(buffer)

But I don't know why it goes through a format string first. That seems inefficient (the format string is reparsed on every iteration), and as my code is running on iOS devices I want to be very conservative with my program's CPU and memory usage.

As I was writing this, I learned my code should really be using UnicodeScalar instead of Character. The problem is that NSMutableString does not let you append a UnicodeScalar value; you have to use Swift's own mutable String type, so now my code looks like:

var buffer = String()
for _ in 0..<charCount {
    let x: UnicodeScalar = readNextCharacter( ... )
    buffer.append(x)
}
return buffer

I thought that String was immutable, but I noticed its append method returns Void.

I still feel uncomfortable doing this because I don't know how Swift's String type is implemented internally, and I don't see how I can preallocate a large buffer to avoid reallocations (assuming Swift's String uses a growing algorithm).

Dai
  • In Swift, _var_ means _variable_ and _let_ means _constant_. In your case, a var String will be mutable and a let String will be immutable. Character can also be appended to a mutable String. For preallocation, you can use `[Character](count: 100, repeatedValue: "0")` to create an array of `Character`s of a certain length. (And convert it back to String using `String(charArray)`). I would say there's no need for this. Appending is quite fast in Swift. – J.Wang Apr 04 '16 at 06:56
  • For what it's worth, there's a Swift StringBuilder gist on GitHub: https://gist.github.com/kristopherjohnson/1fc55e811d944a430289 It looks like it's intended to implement a subset of the C# StringBuilder class, and could be useful when manually converting C# programs to Swift. (At least, if you're not worried about upsetting the Swift purists who would prefer that the code be rewritten to be done "the Swift way".) But unfortunately it's written for a version of Swift prior to Swift 3, needs about 10 minor changes to be accepted as valid Swift 3. – RenniePet Dec 18 '16 at 05:33
  • @J.Wang Doesn't that mean an "immutable mutable" `String` is used with a `let x: String` statement? The internal representation of a mutable string vs. an immutable string can be very different, as they optimize for different scenarios (e.g. immutable substrings). – Dai Nov 01 '17 at 17:40

1 Answer


(This answer was written based on documentation and source code valid for Swift 2 and 3: possibly needs updates and amendments once Swift 4 arrives)

Since Swift is now open-source, we can actually have a look at the source code for Swift's native String.

From the source above, we have the following comment:

/// Growth and Capacity
/// ===================
///
/// When a string's contiguous storage fills up, new storage must be
/// allocated and characters must be moved to the new storage.
/// `String` uses an exponential growth strategy that makes `append` a
/// constant time operation *when amortized over many invocations*.

Given the above, you shouldn't need to worry about the performance of appending characters in Swift (be it via append(_: Character), append(_: UnicodeScalar) or appendContentsOf(_: String)): reallocation of the contiguous storage of a given String instance should be infrequent relative to the number of single characters that must be appended before a reallocation occurs.
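To make the "amortized constant time" claim concrete, here is a minimal sketch of a toy growth model. It is not how String is actually implemented: the `GrowthModel` type, the starting capacity of 1, and the doubling factor are all assumptions made for illustration (the source comment only says the growth is exponential).

```swift
// Toy model of an exponentially growing buffer (hypothetical; NOT String's real code).
// It tracks only counters, so we can see how rarely a "reallocation" happens.
struct GrowthModel {
    private(set) var count = 0           // elements currently stored
    private(set) var capacity = 1        // assumed starting capacity
    private(set) var reallocations = 0   // how many times storage had to grow
    private(set) var elementsCopied = 0  // total elements moved to new storage

    mutating func append() {
        if count == capacity {
            // Storage is full: assume capacity doubles and all elements are moved.
            capacity *= 2
            reallocations += 1
            elementsCopied += count
        }
        count += 1
    }
}

var model = GrowthModel()
for _ in 0..<1000 {
    model.append()
}
print(model.reallocations, model.capacity, model.elementsCopied)  // 10 1024 1023
```

Under these assumptions, 1000 appends trigger only 10 reallocations and move 1023 elements in total, i.e. roughly one extra copy per appended element on average.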

Also note that NSMutableString is not "purely native" Swift, but belongs to the family of bridged Obj-C classes (accessible via Foundation).


A note to your comment

"I thought that String was immutable, but I noticed its append method returns Void."

String is just a (value) type that may be used by mutable as well as immutable properties:

var foo = "foo" // mutable 
let bar = "bar" // immutable
    /* (both the above inferred to be of type 'String') */

The mutating, Void-returning instance methods append(_: Character) and append(_: UnicodeScalar) are visible on mutable as well as immutable String instances, but naturally calling them on the latter yields a compile-time error:

let chars : [Character]  = ["b","a","r"]
foo.append(chars[0]) // "foob"
bar.append(chars[0]) // error: cannot use mutating member on immutable value ...
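As a small illustrative sketch of the value semantics mentioned above (the variable names here are made up for the example): assigning a String to another name gives an independent value, so mutating the `var` does not affect the earlier copy.

```swift
var text = "foo"        // mutable binding
let snapshot = text     // independent value (String is a value type)

text.append("b")        // mutates only `text`

print(text)      // foob
print(snapshot)  // foo
```

So "mutable" vs. "immutable" is a property of the binding (`var` vs. `let`), not of two different string types.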
dfrib
  • So are `+` and `append` the same in terms of performance? Do `s += "a"`, `s = s + "a"` and `s.append("a")` do the same work? – Dan M. Apr 03 '17 at 02:15
  • @DanM. we may visit the (open) source for the stdlib to answer that question: the [`+=` operator](https://github.com/apple/swift/blob/master/stdlib/public/core/String.swift#L547) calls `lhs._core.append(rhs._core)`. The [`+` operator](https://github.com/apple/swift/blob/master/stdlib/public/core/String.swift#L537) creates a new `String` instance to hold the result (named `lhs`), thereafter also calls `lhs._core.append(rhs._core)`. Finally, the [`append(...)` method](https://github.com/apple/swift/blob/master/stdlib/public/core/String.swift#L513) directly calls `_core.append(other._core)`. – dfrib Apr 03 '17 at 18:24
  • ... if we compare the `+=` operator with the `append(...)` method, the former performs an extra check for emptiness and passes an extra `inout` reference to the operator method, neither of which is present in the latter. So, leaving out the case where `self` is empty, the `append(...)` method could be argued to have a slightly leaner implementation than the `+=` operator, but I believe the difference should be negligible. The `+` operator is another story altogether, as it allocates a new instance that it then returns: possibly the compiler can optimize this, but you should use `+=` ... – dfrib Apr 03 '17 at 18:28
  • ... or `append(...)` if you wish to mutate a `String` instance (rather than re-assigning to it). – dfrib Apr 03 '17 at 18:29
  • As `String` has an internal mutable buffer (via `StringCore`, I understand) - is it possible to preallocate that buffer? I can't see any `init` functions that accept a capacity or reserved size parameter. – Dai May 02 '17 at 19:23
  • This answer is only guaranteed to be correct for Swift 3, as major changes will be made to `String` in Swift 4. To allocate a `String`'s internal buffer, use `s.characters.reserveCapacity(capacity)`. This is a protocol requirement of `RangeReplaceableCollection`, which `String.CharacterView` conforms to. Cf. https://developer.apple.com/reference/swift/string.characterview and https://developer.apple.com/reference/swift/string.characterview/1539044-reservecapacity – BallpointBen May 03 '17 at 18:34
  • @Dai as BallpointBen's comment notes, you may reserve capacity for a given number of characters (extended grapheme clusters) by accessing the `CharacterView` of a given `String` instance and reserving capacity on that view. – dfrib May 04 '17 at 06:42
  • @BallpointBen thanks for your comment and your Swift 4 vs Swift <4 prompt. I've added a prompt to the answer and will (try to remember to) look over it once Swift 4 arrives. Thanks! – dfrib May 04 '17 at 06:43
  • @Dai hehe, and that's before even delving into Unicode 10 stuff and Swift's `String`/`Character` .... I still [think C++ takes the prize](http://www.gotw.ca/gotw/084.htm) in this context though, but then again I'm more well-versed in Swift than in C++ (even if I would prefer the reverse :) ) – dfrib May 04 '17 at 07:54
  • In Swift 4 `String` will be made to conform to `Collection` (again) which will push much of its functionality into default implementations in `Collection` and make `String` much less monolithic. – BallpointBen May 04 '17 at 13:16
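On the preallocation question raised in the comments above: in Swift 4 and later, `String` itself conforms to `RangeReplaceableCollection`, so `reserveCapacity(_:)` can be called on the string directly. Below is a rough sketch of the question's scenario under some loud assumptions: `readBits` and `decodeString` are hypothetical helpers, bits are assumed to be stored most significant bit first, and the mapping from 6-bit code to character (0 is "A", 1 is "B", ...) is invented purely for the example.

```swift
// Hypothetical helper: reads `width` bits (most significant bit first) from `bits`,
// starting at `index`, advances `index`, and returns the value read.
func readBits(_ bits: [Bool], at index: inout Int, width: Int) -> UInt8 {
    precondition(width <= 8, "value must fit in a UInt8")
    var value: UInt8 = 0
    for _ in 0..<width {
        value = (value << 1) | (bits[index] ? 1 : 0)
        index += 1
    }
    return value
}

// Hypothetical decoder: builds a String from `characterCount` 6-bit codes.
func decodeString(from bits: [Bool], characterCount: Int) -> String {
    let bitsPerCharacter = 6  // as in the question's example

    var result = String()
    // Ask for storage up front; this is a hint for roughly this many ASCII characters.
    result.reserveCapacity(characterCount)

    var index = 0
    for _ in 0..<characterCount {
        let code = readBits(bits, at: &index, width: bitsPerCharacter)
        // Assumed mapping (not from the original file format): 0 -> "A", 1 -> "B", ...
        // A 6-bit code is at most 63, so `code + 65` cannot overflow UInt8.
        result.unicodeScalars.append(Unicode.Scalar(code + 65))
    }
    return result
}

// Usage: two 6-bit codes, 000000 and 000001.
let sampleBits: [Bool] = Array(repeating: false, count: 11) + [true]
print(decodeString(from: sampleBits, characterCount: 2))  // AB
```

Given the amortized growth described in the answer, the `reserveCapacity` call is an optimization rather than a necessity; it is probably only worth keeping if profiling shows the reallocations matter.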