How to iterate on a string using subscripts in swift

Question

I come from a java background and String slicing in swift appears very verbose and difficult to me. I am trying to answer this leet code question with an algorithm that works with subscripts.

Given a string s, find the length of the longest substring without repeating characters.
Input: s = "abcabcbb"
Output: 3
Explanation: The answer is "abc", with the length of 3.

Here is my code:

func lengthOfLongestSubstring(_ s: String) -> Int {
 var maximum = 0
 var start = 0
 var end = 0
 var set = Set<Character>()
 while end < s.count{
     if set.contains(s[end]){
         set.remove(s[start])
         start+=1
     }
     else{
         set.insert(s[end])
         end+=1
         maximum = max(maximum, end-start)
     }
 }
}

I worked with Int subscripts. But I get this error message: 'subscript(_:)' is unavailable: cannot subscript String with an Int, use a String.Index instead. How can I solve it without getting too verbose?

See also: https://docs.swift.org/swift-book/LanguageGuide/StringsAndCharacters.html#ID494 — jtbandes, Dec 31 '21 at 23:58
You will have an easier time if you simply say `Array(s)` to break the string into an array of characters. Now you can use Int indexing. — matt, Dec 31 '21 at 23:59
The code in the question shows that it's not only about accessing the nth character, but about iterating over a string and working with positions in a string. I propose therefore to reopen the question — Christophe, Jan 01 '22 at 00:07
@matt Would taking this approach on similar string questions help? — Tom Hiddleston, Jan 01 '22 at 00:20
I don't know what that means. I'm just trying to give you an easy way to keep your algorithm. It would be more proper to learn how Swift strings really work. — matt, Jan 01 '22 at 00:52
@matt I was referring to turning a string into an array for easy manipulation — Tom Hiddleston, Jan 01 '22 at 01:09
I know what you were referring to. See my previous comment for my response. — matt, Jan 01 '22 at 01:17

Duncan C · Answer 1 · 2022-01-01T14:52:08.720

3

Other languages make simplifying assumptions about Unicode strings that Swift does not. You can't be sure how many code-points a given character takes to store, so each time you index to the nth character of a unicode string it is an O(n) operation. Thus code that uses random access integer indexing into Unicode has unexpected O(n²) performance. For that reason Swift does not offer integer indexing.

The simple way to get integer indexing into the characters of a Swift string, as Matt suggests in his comment, is to convert the String to an Array of characters. You'll pay the O(n) cost for that conversion ONCE (as well as memory costs for extra storage) and then have fixed-time access to characters after that.

Alternately you can learn how to index into Swift strings using a String.Index, but that will require you to change your current approach.

edited Jan 01 '22 at 14:52

answered Jan 01 '22 at 02:44

Duncan C

128,072
22
173
272

1

To add to this, it's important to note that turning a `String` into an `Array` is an `O(n)` copy, which can be avoidable. Many string manipulation algorithms _don't_ require integer indexing, and even if an algorithm is expressed in terms of integer indexing, if it doesn't require truly random access into a string, it may be expressible in terms of `String` slicing operations instead — these require no copies. Subscripting and functions like [`dropFirst(_:)`](https://developer.apple.com/documentation/swift/string/2893343-dropfirst) are your friend. – Itai Ferber Jan 01 '22 at 03:20
True, converting a string requires `O(n)` time, once. However, that prevents you from accidentally creating `O(n²)` performance, since integer indexing into a Unicode string would operate in `O(n)` time for every operation. – Duncan C Jan 01 '22 at 12:55
I edited my answer to make explicit the fact that converting a String to an Array has a one-time `O(n)` cost. – Duncan C Jan 01 '22 at 14:52
Yes! I think that's a great improvement. I mostly wanted to express in my comment that many folks new to Swift reach for integer-based indexing because they're used to it from other languages, and `Sequence`- and `Collection`-based methods are preferable in many (most?) situations, but this does help clarify. – Itai Ferber Jan 01 '22 at 15:44
1

@ItaiFerber I totally agree. That is why in my comment I said that learning how to work with strings is the better solution. I was just trying to give the OP a way to move forward without doing that. When the OP asked if that was a good idea in general, I demurred, for the very reason you give. – matt Jan 01 '22 at 16:30

Christophe · Answer 2 · 2022-01-01T14:12:47.190

Why is it so difficult to accessing characters in a string?

If strings would be made of fixed sized characters stored in arrays, finding the n-th character would be straightforward and efficient using an integer subscript.

But popular string encodings such as UTF-8 and UTF-16 use a variable number of bytes/words to store characters. So a direct access to the n-th character requires counting characters one by one from the start or other costly strategies, to deal correctly with strings such as "Abçdefg" (9 characters, but 11 UTF16 words and 16 UTF8 bytes)

Quick fix

Since the subscript requires an index and you work with integers, you can get rid of the error, by replace the faulty s[i] with:

s[s.index(s.startIndex, offsetBy: i))]

This is not very concise. But it has the advantage of drawing your attention at the complexity of computing and indexes from an integer character count. (And if you'd profile your code working with very long strings, you'll quickly find out it's a bottleneck).

How about making better use of indexes?

On the other side, many algorithms (including yours) access characters one after another. So there is no need to recount characters from the start over and over again, if you work with relative positions. For this reason, Swift designers chose to subscript strings with indexes.

A better approach would therefore be to use Swift's native way: work with indexes instead of integers and iterate over the string:

func lengthOfLongestSubstring(_ s: String) -> Int {
    var maximum = 0
    var start = s.startIndex   // starting at start of the string
    var end = s.startIndex
    var set = Set<Character>()
    while end != s.endIndex{  // is end of string reached ? 
        if set.contains(s[end]){
            set.remove(s[start])
            start = s.index(after: start)  // going to the next character
        }
        else{
            set.insert(s[end])
            end = s.index(after:end)    // going to the next character
            maximum = max(maximum, s.distance(from: start, to:end)) // count characters between indexes
        }
    }
    return maximum
}

Here a quick introduction on useful indexing functions.

How to iterate on a string using subscripts in swift

2 Answers2

Why is it so difficult to accessing characters in a string?

Quick fix

How about making better use of indexes?