
Before Swift 5, I had this extension working:

    fileprivate extension String {
        func indexOf(char: Character) -> Int? {
            return firstIndex(of: char)?.encodedOffset
        }
    }

Now, I get a deprecated message:

'encodedOffset' is deprecated: encodedOffset has been deprecated as most common usage is incorrect. Use `utf16Offset(in:)` to achieve the same behavior.

Is there a simpler solution to this instead of using utf16Offset(in:)?

I just need the index of the character position passed back as an Int.

Gizmodo

1 Answer


After some time I have to admit that my original answer was incorrect.

In Swift there are two methods: firstIndex(of:) and lastIndex(of:).

Both return Int?, representing the index of the first/last element in an Array that is equal to the passed element (if there is one; otherwise they return nil).

So you should avoid using your custom method to get the index, because there could be two equal elements and you wouldn't know which index you need. Think about your usage and decide which index is more suitable for you: first or last.
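For instance, a minimal sketch of the difference on an Array with duplicate elements (the array name is illustrative only):

    let numbers = [1, 2, 3, 2, 1]

    numbers.firstIndex(of: 2) // Optional(1)
    numbers.lastIndex(of: 2)  // Optional(3)
    numbers.firstIndex(of: 9) // nil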


Original answer:

And what is wrong with utf16Offset(in:)? That is the way to go with Swift 5:

fileprivate extension String {
    // Returns the UTF-16 offset of the first occurrence of `char`, or nil if not found.
    func indexOf(char: Character) -> Int? {
        return firstIndex(of: char)?.utf16Offset(in: self)
    }
}
Robert Dresler
  • 2
    I rest my case! I was having issues with the utf16Offset(in: self) part. – Gizmodo Mar 27 '19 at 19:07
  • 7
    You should use Collection distance method `distance(from start: String.Index, to end: String.Index) -> String.IndexDistance` – Leo Dabus Apr 25 '19 at 03:01
  • 4
    try `"".indexOf(char: "") // 4` – Leo Dabus Apr 25 '19 at 03:01
  • 5
    Can confirm this is *not* the correct answer; Leo Dabus’ comments should *be* an answer and marked as accepted. The UTF-16 offset is not reliable, due to `Character` representing grapheme clusters of varying sizes. Treating every string as though it were UTF-16 regardless of the mix of character-sizes is decidedly incorrect. The *only* correct way is to use the `Collection` functions for converting between `Int` offsets and `String.Index`. – Joshua Nozzi May 01 '19 at 20:14
  • 1
    Felt I should back up my statement above with a handy explanatory link: http://utf8everywhere.org/#myths - this section of this *very informative* site describes very clearly the reason you cannot rely on fixed-width offsets for *any* Unicode text (and `Swift.String` is an array of `Swift.Character`, which is generally a wrapper around *grapheme clusters*). – Joshua Nozzi May 02 '19 at 15:14
  • (shakes head at deleted my-lang-does-it-better-but-not-really drama) It’s a complex issue, friends. If you’ve solved it even for one platform (both performance and ergonomics), you are a super-genius and should publish your work. If not, maybe don’t attack others. – Joshua Nozzi May 04 '19 at 14:55
  • 1
    Just wanted to point out that @LeoDabus gets it right and points out the flaws in the simplest way possible. I keep seeing the UTF-16 offset answer and KNOW it is incorrect for this very reason. I plan to implement the indexDistance for a parser. – Lloyd Sargent Mar 05 '21 at 23:01
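
For completeness, here is a minimal sketch of the Collection-based approach Leo Dabus describes in the comments above, assuming you still want an Int back; the helper name indexDistance(of:) is illustrative, not part of the standard library:

    fileprivate extension String {
        // Counts Characters (grapheme clusters) from the start of the string
        // to the first occurrence of `char`; returns nil if `char` isn't found.
        func indexDistance(of char: Character) -> Int? {
            guard let index = firstIndex(of: char) else { return nil }
            return distance(from: startIndex, to: index)
        }
    }

    "🇺🇸abc".indexDistance(of: "a") // 1 (utf16Offset(in:) would report 4 here)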