How to do String programming in Swift

Question

I miss usable String functions that are easy to use without typing lines of strange identifiers. So I decided to build a library of useful and recognizable String functions.

I would recommend reading through this chapter of the language specification https://developer.apple.com/library/prerelease/ios/documentation/Swift/Conceptual/Swift_Programming_Language/StringsAndCharacters.html — Nicola Miotto, Aug 24 '14 at 13:35
+1 to both question and answer, for splitting from the original post. Welcome to StackOverflow! — Alexis Pigeon, Aug 24 '14 at 13:37
@Nicola: That is what I did, but it took a lot of time. This post is for all newcomers who want to do something with Strings in Swift easily, without reading for hours. If you read through my answer and find something that can be done more easily, please leave a comment. Thanks. — j.s.com, Aug 24 '14 at 13:41

j.s.com · Accepted Answer · 2014-08-24T15:24:47.760

I first tried to use Cocoa String-Functions to solve this problem. So I tried in the playground:

import Cocoa

func PartOfString(s: String, start: Int, length: Int) -> String
{
    return s.substringFromIndex(advance(s.startIndex, start - 1)).substringToIndex(advance(s.startIndex, length))
}

PartOfString("HelloaouAOUs.World", 1, 5) --> "Hello"
PartOfString("HelloäöüÄÖÜß…World", 1, 5) --> "Hello"

PartOfString("HelloaouAOUs.World", 1, 18) --> "HelloaouAOUs.World"
PartOfString("HelloäöüÄÖÜß…World", 1, 18) --> "HelloäöüÄÖÜß…World"

PartOfString("HelloaouAOUs.World", 6, 7) --> "aouAOUs"
PartOfString("HelloäöüÄÖÜß…World", 6, 7) --> "äöüÄO"

The last result is wrong: the String contains Unicode characters and the index passed to "substringFromIndex" is not the start index. Even worse, the Swift program sometimes crashes at run time in this situation. So I decided to create a set of new functions that take care of this problem and work with Unicode characters. Please note that filenames can contain Unicode characters as well. So if you think you do not need Unicode characters, you are wrong.

If you want to reproduce this, you need the same String I used, because copying from this Web-Page does not reproduce the problem.

var s: String = "HelloäöüÄÖÜß…World"
var t: String = s.stringByAddingPercentEscapesUsingEncoding(NSUTF8StringEncoding)!
var u: String = "Helloa%CC%88o%CC%88u%CC%88A%CC%88O%CC%88U%CC%88%C3%9F%E2%80%A6World".stringByRemovingPercentEncoding!
var b: Bool = (s == u) --> true
PartOfString(s, 6, 7) --> "äöüÄO"

Now you could get the idea to convert the troublesome decomposed Unicode characters to precomposed ones, using the compatibility mapping, with the following function:

func percentescapesremove (s: String) -> String
{
    return (s.stringByRemovingPercentEncoding!.precomposedStringWithCompatibilityMapping)
}

And the result you will get is:

var v: String = percentescapesremove(t) --> "HelloäöüÄÖÜß...World"
PartOfString(v, 6, 7) --> "äöüÄÖÜß"
var a: Bool = (s == v) --> false

When you do so, the "äöüÄÖÜß" looks good and you think everything is OK. But look at the "...": the single Unicode character "…" has been permanently converted to three ASCII periods "...", so the result is not identical to the first string. If you have Unicode filenames, converting them will result in not finding the file on a volume. So it is a good idea to convert only for screen output and keep the original String in a safe place.
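
The difference between the two mappings can be sketched in current Swift (5.x is assumed here, not the 2014 dialect used in the rest of this answer). The constant names are made up for illustration. The canonical mapping only recombines "a" and the combining "¨", while the compatibility mapping also replaces "…":

```swift
import Foundation

// Decomposed "ä" ("a" + combining diaeresis) followed by the ellipsis "…"
let original = "a\u{0308}\u{2026}"

// Canonical composition: recombines the "ä", leaves "…" untouched.
let canonical = original.precomposedStringWithCanonicalMapping

// Compatibility composition: also replaces "…" with three periods.
let compatible = original.precomposedStringWithCompatibilityMapping

print(canonical == original)    // Swift compares Strings by canonical equivalence
print(compatible == original)
print(compatible == "\u{00E4}...")
```

If only the combining characters are the problem, the canonical mapping might be the safer choice, because the result still compares equal to the original String.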

The problem with the PartOfString function above is that "substringFromIndex" generates a new String, and the function then applies an index of the old String to this new one. That does not work, because the Unicode characters have a different length than the normal letters. So I improved the function (thanks to Martin R for his help):

func NewPartOfString(s: String, start: Int, length: Int) -> String
{
    let t: String = s.substringFromIndex(advance(s.startIndex, start - 1))
    return t.substringToIndex(advance(t.startIndex, length))
}

And the result is correct:

NewPartOfString("HelloaouAOUs.World", 1, 5) --> "Hello"
NewPartOfString("HelloäöüÄÖÜß…World", 1, 5) --> "Hello"

NewPartOfString("HelloaouAOUs.World", 1, 18) --> "HelloaouAOUs.World"
NewPartOfString("HelloäöüÄÖÜß…World", 1, 18) --> "HelloäöüÄÖÜß…World"

NewPartOfString("HelloaouAOUs.World", 6, 7) --> "aouAOUs"
NewPartOfString("HelloäöüÄÖÜß…World", 6, 7) --> "äöüÄÖÜß"
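
For readers on a current toolchain, the same idea might look as follows. This is only a sketch that assumes Swift 5.x; the name partOfString and its argument labels are invented here. Both indices are computed on the same String, and "limitedBy" avoids a crash when the requested range runs past the end:

```swift
// Hypothetical modern counterpart of NewPartOfString: 1-based start, length in characters.
func partOfString(_ s: String, start: Int, length: Int) -> String {
    guard start >= 1, length > 0 else { return "" }
    // Returns nil instead of trapping if start lies beyond the end of the String.
    guard let from = s.index(s.startIndex, offsetBy: start - 1, limitedBy: s.endIndex) else {
        return ""
    }
    // Clamp the end to endIndex when fewer than "length" characters remain.
    let to = s.index(from, offsetBy: length, limitedBy: s.endIndex) ?? s.endIndex
    return String(s[from..<to])
}

// The decomposed test String from above, written with explicit escapes.
let decomposed = "Helloa\u{0308}o\u{0308}u\u{0308}A\u{0308}O\u{0308}U\u{0308}\u{00DF}\u{2026}World"
print(partOfString(decomposed, start: 6, length: 7))
```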

In the next step I will show a few functions that can be used and work well. All of them are based on integer index values that start at 1 for the first character and end with the index of the last character, which is identical to the length of the String.

This function returns the length of a string:

func len (s: String) -> Int
{
    return (countElements(s)) // Not really fast, because every Unicode character has to be counted
}

This function returns the Unicode number of the first Unicode character in the String:

func asc (s: String) -> Int
{
    if (s == "")
    {
        return 0
    }
    else
    {
        return (Int(s.unicodeScalars[s.unicodeScalars.startIndex].value))
    }
}

This function returns the Unicode character for the given Unicode number:

func char (c: Int) -> String
{
    var s: String = String(UnicodeScalar(c))
    return (s)
}
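
In current Swift (5.x assumed), the scalar initializer returns an optional, so invalid numbers can be handled without a crash. The following is a rough sketch; the names scalarValue and string(fromScalarValue:) are hypothetical and not part of any library:

```swift
// Number of the first Unicode scalar, or 0 for an empty String (like asc above).
func scalarValue(of s: String) -> Int {
    guard let first = s.unicodeScalars.first else { return 0 }
    return Int(first.value)
}

// String for a given scalar number, or "" if the number is not a valid scalar.
func string(fromScalarValue c: Int) -> String {
    guard let raw = UInt32(exactly: c), let scalar = Unicode.Scalar(raw) else {
        return ""
    }
    return String(Character(scalar))
}

print(scalarValue(of: "A"))
print(string(fromScalarValue: 228))
```

Returning "" for invalid input is a design choice made for this sketch; returning an optional String would be the more usual Swift style.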

This function returns the Upper-Case representation of a String:

func ucase (s: String) -> String
{
    return (s.uppercaseString)
}

This function returns the Lower-Case representation of a String:

func lcase (s: String) -> String
{
    return (s.lowercaseString)
}

The next Function gives the left part of a String with a given length:

func left (s: String, length: Int) -> String
{
    if (length < 1)
    {
        return ("")
    }
    else
    {
        if (length > len(s))
        {
            return (s)
        }
        else
        {
            return (s.substringToIndex(advance(s.startIndex, length)))
        }
    }
}

The next Function gives the right part of a String with a given length:

func right (s: String, laenge: Int) -> String
{
    var L: Int = len(s)
    if (L <= laenge)
    {
        return(s)
    }
    else
    {
        if (laenge < 1)
        {
            return ("")
        }
        else
        {
            let t: String = s.substringFromIndex(advance(s.startIndex, L - laenge))
            return t.substringToIndex(advance(t.startIndex, laenge))
        }
    }
}

The next Function gives a part of a String with a given start and length:

func mid (s: String, start: Int, laenge: Int) -> String
{
    if (start <= 1)
    {
        return (left(s, laenge))
    }
    else
    {
        var L: Int = len(s)
        if ((start > L) || (laenge < 1))
        {
            return ("")
        }
        else
        {
            if (start + laenge > L)
            {
                let t: String = s.substringFromIndex(advance(s.startIndex, start - 1))
                return t.substringToIndex(advance(t.startIndex, L - start + 1))
            }
            else
            {
                let t: String = s.substringFromIndex(advance(s.startIndex, start - 1))
                return t.substringToIndex(advance(t.startIndex, laenge))
            }
        }
    }
}
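
With the collection methods of current Swift (5.x assumed), the three functions above might shrink to one line each. This is a sketch, not a drop-in replacement: the unlabeled parameters are a choice made here, and negative values are clamped to 0 because "prefix", "suffix" and "dropFirst" require non-negative counts:

```swift
// Leftmost "length" characters.
func left(_ s: String, _ length: Int) -> String {
    return String(s.prefix(max(0, length)))
}

// Rightmost "length" characters.
func right(_ s: String, _ length: Int) -> String {
    return String(s.suffix(max(0, length)))
}

// "length" characters beginning at the 1-based position "start".
func mid(_ s: String, _ start: Int, _ length: Int) -> String {
    return String(s.dropFirst(max(0, start - 1)).prefix(max(0, length)))
}

print(left("Hello World", 5))
print(right("Hello World", 5))
print(mid("Hello World", 7, 3))
```

All three count whole characters, so a decomposed "ä" is never split.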

It is a little more difficult to get a character at a given position, because "substringFromIndex" and "substringToIndex" cannot be combined reliably when the index passed to "substringFromIndex" is not the start index. So the idea is to step through the string, character by character, and collect the needed substring.

func CharacterOfString(s: String, index: Int, length: Int) -> String
{
    var c: String = ""
    var i: Int = 0
    for UniCodeChar in s.unicodeScalars
    {
        i = i + 1
        if ((i >= index) && (i < index + length))
        {
            c = c + String(UniCodeChar)
        }
    }
    return (c)
}

But this does not work correctly for Strings which contain Unicode characters. The following examples show what happens:

CharacterOfString("Swift Example Text aouAOUs.", 16, 8) --> "ext aouA"
len(CharacterOfString("Swift Example Text aouAOUs.", 16, 8)) --> 8

CharacterOfString("Swift Example Text äöüÄÖÜß…", 16, 8) --> "ext äö"
len(CharacterOfString("Swift Example Text äöüÄÖÜß…", 16, 8)) --> 6

So we see that the resulting String is too short, because one visible character can consist of more than one Unicode scalar. For example, "ä" can be a single Unicode scalar, but it can also be written as two: "a" followed by the combining "¨". So we need another way to get a valid substring.
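
The two spellings of "ä" can be compared directly. The sketch below assumes current Swift (5.x), and the constant names are only illustrative. Both Strings compare equal and contain one character, but they differ in their number of Unicode scalars and UTF-8 bytes:

```swift
let precomposed = "\u{00E4}"   // "ä" as one Unicode scalar
let decomposed = "a\u{0308}"   // "a" followed by the combining diaeresis

print(precomposed == decomposed)            // Strings are compared by canonical equivalence
print(precomposed.count, decomposed.count)  // number of characters
print(precomposed.unicodeScalars.count, decomposed.unicodeScalars.count)
print(precomposed.utf8.count, decomposed.utf8.count)
```

This is why a loop over "unicodeScalars" counts the decomposed "ä" twice, while counting characters does not.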

The solution is to convert the Unicode String to an array of Unicode characters and to use the index of the array to get a valid character. This works in all cases to get a single character of a Unicode String at a given index:

func indchar (s: String, i: Int) -> String
{
    if ((i < 1) || (i > len(s)))
    {
        return ("")
    }
    else
    {
        return String(Array(s)[i - 1])
    }
}
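
As noted in the comments, building an array of all characters is costly for long Strings. A possible alternative in current Swift (5.x assumed) walks only as far as the requested position. The function name character(at:in:) is hypothetical:

```swift
// Character at a 1-based position, or "" if the position is out of range.
func character(at position: Int, in s: String) -> String {
    guard position >= 1,
          let idx = s.index(s.startIndex, offsetBy: position - 1, limitedBy: s.endIndex),
          idx < s.endIndex   // endIndex itself is not a valid subscript
    else {
        return ""
    }
    return String(s[idx])
}

print(character(at: 6, in: "Helloa\u{0308}World"))
```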

And with this knowledge, I have built a function which can get a valid Unicode substring with a given start index and a given length:

func substring(s: String, Start: Int, Length: Int) -> String
{
    var L: Int = len(s)
    var UniCode = Array(s)
    var result: String = ""
    var TheEnd: Int = Start + Length - 1

    if ((Start < 1) || (Start > L))
    {
        return ("")
    }
    else
    {
        if ((Length < 0) || (TheEnd > L))
        {
            TheEnd = L
        }
        for var i: Int = Start; i <= TheEnd; ++i
        {
            result = result + String(UniCode[i - 1])
        }
        return (result)
    }
}

The next Function searches for the position of a given String in another String:

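func position (original: String, search: String, start: Int) -> Int
{
    // Search only in the part of the String that begins at "start"
    let rest: String = substring(original, start, -1)
    var index = rest.rangeOfString(search)
    if (index != nil)
    {
        // Measure the distance inside the String that was actually searched
        var pos: Int = distance(rest.startIndex, index!.startIndex)
        return (pos + start)
    }
    else
    {
        return (0)
    }
}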
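
In current Swift (5.x with Foundation assumed), the search can be restricted to a range of the original String, so no second String and no mixing of indices is needed. This is a sketch; the labels "of", "in" and "from" are invented for this example:

```swift
import Foundation

// 1-based position of "search" in "original", searching from position "start"; 0 if not found.
func position(of search: String, in original: String, from start: Int = 1) -> Int {
    guard start >= 1,
          let from = original.index(original.startIndex, offsetBy: start - 1,
                                    limitedBy: original.endIndex),
          let found = original.range(of: search, options: [], range: from..<original.endIndex)
    else {
        return 0
    }
    // Both indices belong to "original", so the distance is counted in its characters.
    return original.distance(from: original.startIndex, to: found.lowerBound) + 1
}

print(position(of: "World", in: "Hello World"))
print(position(of: "o", in: "Hello World", from: 6))
```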

This function checks whether a given character code is a digit (0-9):

func number (n: Int) -> Bool
{
    return ((n >= 48) && (n <= 57)) // "0" to "9"
}

Now the basic String operations are shown, but what about numbers? How are numbers converted to Strings and vice versa? Let's have a look at converting Strings to numbers. Please note the "!" in the third line, which is used to get an Int and not an optional Int.

var s: String = "123" --> "123"
var o: Int? = s.toInt() --> (123)
var v: Int = s.toInt()! --> 123

But this does not work if the String contains characters other than digits:

var s: String = "123." --> "123."
var v: Int = s.toInt()! --> Will result in a Runtime Error, because s.toInt() = nil

So I decided to build a smarter function to get the value of a String:

func val (s: String) -> Int
{
    var p: Int = 0
    var sign: Int = 0

    if (indchar(s, 1) == "-")
    {
        sign = 1
        p = 1
    }
    while(number(asc(indchar(s, p + 1))))
    {
        p = p + 1
    }
    if (p > sign)
    {
        return (left(s, p).toInt()!)
    }
    else
    {
        return (0)
    }
}

Now the result is correct and does not produce a Runtime-Error:

var s: String = "123." --> "123."
var v: Int = val(s) --> 123
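
In current Swift (5.x assumed), "toInt()" has been replaced by the failable initializer Int(_:), which still returns nil for "123.". The lenient behavior of val could be sketched as follows; the name leadingInt is hypothetical, and returning 0 for unparsable or overflowing input is a choice made here to mirror val:

```swift
// Parses an optional "-" and the leading ASCII digits; returns 0 if there are none.
func leadingInt(_ s: String) -> Int {
    var rest = Substring(s)
    var sign = ""
    if rest.hasPrefix("-") {
        sign = "-"
        rest = rest.dropFirst()
    }
    // Stop at the first character that is not an ASCII digit.
    let digits = rest.prefix(while: { $0.isASCII && $0.isNumber })
    return Int(sign + String(digits)) ?? 0
}

print(leadingInt("123."))
print(Int("123.") == nil)   // the strict conversion rejects the trailing "."
```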

And now the same for Floating-Point Numbers:

func realval (s: String) -> Double
{
    var r: Double = 0
    var p: Int = 1
    var negative: Bool = false
    if (indchar(s, 1) == "-")
    {
        negative = true
        p = 2
    }
    var a: Int = asc(indchar(s, p))
    while (number(a)) // Integer part
    {
        r = (r * 10) + Double(a - 48)
        p = p + 1
        a = asc(indchar(s, p))
    }
    if ((a == 44) || (a == 46)) // Decimal separator "," or "."
    {
        var scale: Double = 10
        p = p + 1
        a = asc(indchar(s, p))
        while (number(a)) // Fractional part, digit by digit, so leading zeros are kept
        {
            r = r + (Double(a - 48) / scale)
            scale = scale * 10
            p = p + 1
            a = asc(indchar(s, p))
        }
    }
    if (negative)
    {
        r = -r
    }
    if ((a == 69) || (a == 101)) // Exponent "E" or "e"
    {
        var exp: Int = val(substring(s, p + 1, -1))
        for var i: Int = 1; i <= abs(exp); ++i
        {
            if (exp > 0)
            {
                r = r * 10
            }
            else
            {
                r = r / 10
            }
        }
    }
    return (r)
}

This works for floating-point numbers with exponents:

var s: String = "123.456e3"
var t: String = "123.456e-3"
var v: Double = realval(s) --> 123456
var w: Double = realval(t) --> 0.123456
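
For comparison, current Swift (5.x assumed) ships a failable initializer Double(_:) that already understands exponents. Unlike realval, it rejects trailing characters and a comma as decimal separator, so it is not a full replacement. The constant names below are illustrative only:

```swift
let v = Double("123.456e3")    // optional Double
let w = Double("123.456e-3")
let trailing = Double("123.456abc")  // nil: the whole String must be a number
let comma = Double("1,5")            // nil: "," is not accepted as separator

print(v ?? 0, w ?? 0)
print(trailing == nil, comma == nil)
```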

Generating a String from an Integer is much simpler:

func str (n: Int) -> String
{
    return (String(n))
}

Converting a floating-point variable does not work with String(n), but it can be done with string interpolation:

func strreal (n: Double) -> String
{
    return ("\(n)")
}
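
In current Swift (5.x assumed), String(n) also accepts a Double, so the restriction above applies only to the 2014 compiler. When a fixed number of decimals is wanted, the Foundation initializer String(format:) is one option. A short sketch with made-up constant names:

```swift
import Foundation

let fromInt = String(42)
let fromDouble = String(1.5)
let interpolated = "\(1.5)"
let twoDecimals = String(format: "%.2f", 3.14159)

print(fromInt, fromDouble, interpolated, twoDecimals)
```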

I cannot reproduce your initial statement about `println(PartOfString("HelloäöüÄÖÜß…World", 6, 7))`, it returns the correct result `"äöüÄÖÜß"` in my app. But creating an array of all characters of a string just to extract a substring or a single character is very inefficient for really long strings. — Martin R, Aug 24 '14 at 13:46
I also have some doubt if its worth to "re-program" the conversions to int or real values. If you don't like the strict behaviour of `toInt()` then you can still use `intValue()` / `doubleValue()` from Cocoa, e.g. `let i = (s as NSString).doubleValue`. — Martin R, Aug 24 '14 at 13:49
I used the Unicode String "Hello%C3%A4%C3%B6%C3%BC%C3%84%C3%96%C3%9C%C3%9F...World". When copying it from this web page it works, but not with the original one. The reason is the different encodings for the same visual appearance. If you want to reproduce this, use: PartOfString("Hello%C3%A4%C3%B6%C3%BC%C3%84%C3%96%C3%9C%C3%9F...World".stringByRemovingPercentEncoding!.decomposedStringWithCanonicalMapping, 6, 8) — j.s.com, Aug 24 '14 at 13:59
It seems to work correctly if you use `substringWithRange` (as in some of the answers to this question http://stackoverflow.com/questions/24029163/finding-index-of-character-in-swift-string): `let startIndex = advance(s.startIndex, start-1); let endIndex = advance(startIndex, length); return s.substringWithRange(startIndex ..< endIndex)` — Martin R, Aug 24 '14 at 14:12
The problem with your PartOfString function is that `s.substringFromIndex()` returns a *new* string, so that you cannot use `s.startIndex` safely on that new string, see the last part of http://stackoverflow.com/a/24056932/1187415. — Martin R, Aug 24 '14 at 14:24
Sorry, but I do not think that this is useful for newcomers. Your initial premise *""substringToIndex" does not work correctly"* is wrong. Your PartOfString() function is wrong because it uses the index of one string to index a different string. Then you present other solutions which create an array of all characters first, which is not necessary and ineffective for large strings. — Martin R, Aug 24 '14 at 14:43
I have added your working suggestion, thank you very much for this. Other newcomers make the same mistakes, I guess. Before adding the post here and finding you to solve my problem, I tried debugging the source for several weeks and had some people, who programmed for years in Obj-C, who tried to help me without success. I will add some new routines which take advantage of this knowledge. These ones, which crashed my programs until now. — j.s.com, Aug 24 '14 at 15:00
Your "answer" is a mix of problem descriptions, failed solution attempts and solutions, written in a blog-like style. I find it very difficult to read. SO uses a strict separation between questions stating the problem, and answers providing the solution. — Martin R, Aug 24 '14 at 15:15
@Martin: Please keep in mind that this is my first post. Comments like yours discourage me, so that I have to think about whether to post again. I think a blog like this lives from people who invest time to write something useful. I read a lot of posts here before posting myself. But most posts were not very satisfying, because the code did not work and was very difficult to understand for a newcomer like me. So I decided to make a very detailed post, so everyone can copy a whole working function instead of a code fragment which does not work because of missing initialization code. — j.s.com, Aug 24 '14 at 15:31
