
I'm a little confused about the best practices for Swift 4 string manipulation.

How do you handle the following:

let str = "test"
let start = str.index(str.startIndex, offsetBy: 7)


Thread 1: Fatal error: cannot increment beyond endIndex

Imagine that you do not know the length of the variable 'str' above. And since 'start' is not an optional value, what is the best practice to prevent that crash?

Hamish
Buyin Brian

2 Answers


If you use the variation with limitedBy parameter, that will return an optional value:

if let start = str.index(str.startIndex, offsetBy: 7, limitedBy: str.endIndex) {
    ... 
}

That will gracefully detect whether the offset would move the index past endIndex, returning nil in that case. Handle the optional however suits your scenario best (if let, guard let, the nil-coalescing operator, etc.).
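
As a minimal sketch of those options (the names `safeIndex` and `suffix(of:from:)` are hypothetical helpers, not standard-library APIs), the same check can be paired with guard let or the nil-coalescing operator. It assumes a non-negative offset is wanted, so negative offsets are rejected up front:

```swift
// Hypothetical helper: returns nil instead of trapping when the offset is out of range.
func safeIndex(in str: String, offset: Int) -> String.Index? {
    // limitedBy: endIndex only guards forward movement, so reject negative offsets here.
    guard offset >= 0 else { return nil }
    return str.index(str.startIndex, offsetBy: offset, limitedBy: str.endIndex)
}

// guard let: bail out early with an empty result when the offset is too large.
func suffix(of str: String, from offset: Int) -> String {
    guard let start = safeIndex(in: str, offset: offset) else { return "" }
    return String(str[start...])
}

// Nil coalescing: fall back to endIndex when the offset is too large.
let str = "test"
let start = safeIndex(in: str, offset: 7) ?? str.endIndex
let tail = String(str[start...])   // empty string
```

Note that an offset equal to the string's length yields endIndex rather than nil, which is valid as a range bound but must not be used to subscript a single character.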

Rob

Your code doesn't do any range checking:

let str = "test"
let start = str.index(str.startIndex, offsetBy: 7)

Write a function that tests the length of the string first. In fact, you could create an extension on String that lets you use integer subscripts, and returns a Character?:

extension String {
  // Allow string[Int] subscripting. WARNING: slow, O(n) performance.
  subscript(index: Int) -> Character? {
    // Return nil for negative or too-large offsets instead of crashing.
    guard index >= 0, index < count else { return nil }
    return self[self.index(startIndex, offsetBy: index)]
  }
}

This code:

let str = "test"
print("str[7] = \"\(String(describing: str[7]))\"")

Would display:

str[7] = "nil"
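
A possible variant, sketched here under the assumption that a named method is acceptable in place of a subscript (`character(at:)` is a hypothetical name), uses `index(_:offsetBy:limitedBy:)` so the string is walked only as far as the requested offset rather than being counted first:

```swift
extension String {
    // Hypothetical method; still O(n) in the offset, but avoids a separate `count` pass.
    func character(at offset: Int) -> Character? {
        guard offset >= 0,
              let i = index(startIndex, offsetBy: offset, limitedBy: endIndex),
              i < endIndex   // an offset equal to the length yields endIndex, which cannot be subscripted
        else { return nil }
        return self[i]
    }
}
```

With this, `"test".character(at: 7)` returns nil instead of crashing.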

EDIT:

Be aware, as Alexander pointed out in a comment below, that the subscript extension above has O(n) performance: count has to walk the whole string, and index(_:offsetBy:) walks from the start of the string to the requested offset, so it takes longer as the index value goes up.

If you need to loop through all the characters in a string, code like this:

for i in 0..<str.count { doSomething(string: str[i]) }

would have O(n^2) (n-squared) performance, which is really, really bad. In that case, you should instead first convert the string to an array of characters:

 let chars = Array(str)

 for i in 0..<chars.count { doSomething(string: chars[i]) }

or

 for aChar in chars {
     // do something with aChar
 }

With that code you pay the O(n) time cost of converting the string to an array of characters once, and then you can do operations on the array of characters with maximum speed. The downside of that approach is that it would more than double the memory requirements.
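
As a rough sketch of that trade-off (the function name `char(at:in:)` and the variable `text` are hypothetical), the array can also be indexed safely, which addresses the original out-of-range crash with constant-time lookups after the one-time conversion:

```swift
let text = "test"
let chars = Array(text)   // one O(n) pass; the elements are Character values

// Constant-time, bounds-checked lookup on the array.
func char(at i: Int, in chars: [Character]) -> Character? {
    return chars.indices.contains(i) ? chars[i] : nil
}

let second = char(at: 1, in: chars)    // Optional("e")
let missing = char(at: 7, in: chars)   // nil, no crash
```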

Duncan C
  • I would discourage such a subscript. It makes it seem like an `O(1)` operation, when in fact it's `O(self.count)`. Code like `for i in str.indices { doSomething(str[i]) }` looks `O(str.count)`, when in fact it's `O(str.count^2)` – Alexander Dec 10 '17 at 18:16
  • @Alexander - `index(_:offsetBy:limitedBy:)` is O(n), too. So how do you propose to achieve O(1)? That having been said, I'd still use `index(_:offsetBy:limitedBy:)` (why manually do this when there's a function that does it for you), but I'm not sure how you'd achieve the O(1) behavior you suggest. – Rob Dec 10 '17 at 18:44
  • Alexander, fair point about the performance hit of using a subscript operator with hidden `O(n)` performance. If you need to do a series of operations then you could get `O(n)` for the entire array operation by first converting the string to an array of characters and then operating on the array using `Array(aString.characters)` – Duncan C Dec 10 '17 at 18:57
  • Then you could use `chars = Array(aString.characters); for aChar in chars {//code}` and get `O(n)` processing all the characters. – Duncan C Dec 10 '17 at 19:01
  • @Rob I'm not suggesting there's a general `O(1)` solution, I'm just expressing my disapproval of hiding it behind a subscript operator, which makes it easily get overlooked as a constant-time indexing operation, like into an array. – Alexander Dec 10 '17 at 19:12
  • I've edited my answer to warn about the `O(n)` cost of my subscript extension. – Duncan C Dec 10 '17 at 19:16