9

I need to remove leading and trailing spaces around punctuation characters.

For example: Hello , World ... I 'm a newbie iOS Developer.

And I'd like to have: Hello, World... I'm a newbie iOS Developer.

How can I do this? I tried splitting the string into components and enumerating it by sentences, but that is not what I need.

Ozgur Vatansever
Vika Grinyuk
  • That kind of function is called trim(). Duplicate: https://stackoverflow.com/questions/26797739/does-swift-have-a-trim-method-on-string – zvone Oct 02 '17 at 20:05
  • I don’t think trim would do what they want. Note that there are still spaces in the desired result. This is much trickier than that. – Catfish_Man Oct 02 '17 at 20:05
  • @zvone no, trim only works for the beginning and the end of the string, it does not look inside the body. – Eric Aya Oct 02 '17 at 20:07
  • I think your best bet is to look into regular expressions (RegEx). I'm only passingly familiar with them, but you can set up an expression that looks for spaces next to punctuation and replaces them with just the punctuation. It will for sure take a fair amount of testing and experimentation to get the expression right. – theMikeSwan Oct 02 '17 at 20:12
  • You can go to regexpal.com to try out various expressions and see what works. – theMikeSwan Oct 02 '17 at 20:14
  • On my phone without a computer to write you a sample, but if it's always a natural language string, and especially if you need to support languages other than English, you might check out NSLinguisticTagger, which has punctuation and whitespace tags. Intro: http://nshipster.com/nslinguistictagger/ Docs: https://developer.apple.com/documentation/foundation/nslinguistictagger – Dad Oct 02 '17 at 21:55

8 Answers

8

Rob's answer is great, but you can trim it down quite a lot by taking advantage of the \p{Po} regular expression class. Getting rid of the spaces around punctuation then becomes a single regular expression replace:

import Foundation

let input = "Hello ,  World ... I 'm a newbie iOS Developer."
let result = input.replacingOccurrences(of: "\\s*(\\p{Po}\\s?)\\s*",
                                        with: "$1",
                                        options: [.regularExpression])
print(result) // "Hello, World... I'm a newbie iOS Developer."

Rob's answer also tries to trim leading/trailing spaces, but your input doesn't have any of those. If you do care about that you can just call result.trimmingCharacters(in: .whitespacesAndNewlines) on the result.
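Here is a minimal sketch of that combination. The `padded` input is made up for illustration (the original input with whitespace added at both ends); the pattern is the same one used above.

```swift
import Foundation

// Hypothetical input: the original sentence with extra whitespace at both ends.
let padded = "  Hello ,  World ... I 'm a newbie iOS Developer.  \n"

// Step 1: collapse whitespace around punctuation with the same pattern as above.
let collapsed = padded.replacingOccurrences(of: "\\s*(\\p{Po}\\s?)\\s*",
                                            with: "$1",
                                            options: [.regularExpression])

// Step 2: strip whatever whitespace is still left at the two ends.
let cleaned = collapsed.trimmingCharacters(in: .whitespacesAndNewlines)
print(cleaned)
```

The trim has to come second here, because the regular expression deliberately keeps one space after the final period.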


Here's an explanation for the regular expression. Removing the double-escapes it looks like

\s*(\p{Po}\s?)\s*

This is composed of the following components:

  • \s* - Match zero or more whitespace characters (and throw them away)
  • (…) - Capturing group. Anything inside this group is preserved by the replacement (the $1 in the replacement refers to this group).
    • \p{Po} - Match a single character in the "Other_Punctuation" unicode category. This includes things like ., ', and , but excludes things like ( or -.
    • \s? - Match a single optional whitespace character. This preserves the space after periods (or ellipses).
  • \s* - Once again, match zero or more whitespace characters (and throw them away). This is what turns your ",  World" (two spaces after the comma) into ", World" (one space).
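If you want to check which characters \p{Po} actually covers, a quick probe like the one below may help. The helper name `isOtherPunctuation` is hypothetical and exists only for this illustration.

```swift
import Foundation

// Hypothetical helper: true when the string is exactly one
// "Other_Punctuation" (Po) character.
func isOtherPunctuation(_ s: String) -> Bool {
    return s.range(of: "^\\p{Po}$", options: .regularExpression) != nil
}

// Period, comma, apostrophe and exclamation mark are in Po;
// "(" is open punctuation (Ps) and "-" is dash punctuation (Pd).
for sample in [".", ",", "'", "!", "(", "-"] {
    print(sample, isOtherPunctuation(sample))
}
```

Running a probe like this against the punctuation you expect in your own input is a cheap way to find out whether the single-class approach is enough.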
Lily Ballard
7

For Swift 3 or 4 you can use:

let trimmedString = string.trimmingCharacters(in: .whitespaces)
Fox5150
    This method only works if your spaces are at the start or end of a string. It will ignore any spaces in between words. – Makoren Jan 08 '21 at 00:52
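A short illustration of that limitation, using a made-up `sample` string:

```swift
import Foundation

// Hypothetical input with spaces at both ends and one before the comma.
let sample = "  Hello , World  "

// Only the leading and trailing spaces are removed.
let trimmed = sample.trimmingCharacters(in: .whitespaces)
print("[\(trimmed)]") // the space before the comma is still there
```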
5

This is a really wonderful problem and a shame that it isn't easier to do in Swift today (someday it will be, but not today).

I kind of hate this code, but I'm getting on a plane for 20 hours, and don't have time to make it nicer. This may at least get you started using NSMutableString. It'd be nice to work in String, and Swift hates regular expressions, so this is kind of hideous, but at least it's a start.

import Foundation

let input = "Hello,  World ... I 'm a newbie iOS Developer."

let adjustments = [
    (pattern: "\\s*(\\.\\.\\.|\\.|,)\\s*", replacement: "$1 "), // ellipsis or period or comma has trailing space
    (pattern: "\\s*'\\s*", replacement: "'"), // apostrophe has no extra space
    (pattern: "^\\s+|\\s+$", replacement: ""), // remove leading or trailing space
]

let mutableString = NSMutableString(string: input)

for (pattern, replacement) in adjustments {
    let re = try! NSRegularExpression(pattern: pattern)
    re.replaceMatches(in: mutableString,
                      options: [],
                      range: NSRange(location: 0, length: mutableString.length),
                      withTemplate: replacement)
}
mutableString // "Hello, World... I'm a newbie iOS Developer."

Regular expressions can be very confusing when you first encounter them. A few hints at reading these:

  • The specific language Foundation uses is described by ICU.

  • Backslash (\) means "the next character is special" for a regex. But inside a Swift string, backslash also means "the next character is special" for the string. So you have to double them all.

  • \s means "a whitespace character"

  • \s* means "zero or more whitespace characters"

  • \s+ means "one or more whitespace characters"

  • $1 means "the thing we matched in parentheses"

  • | means "or"

  • ^ means "start of string"

  • $ means "end of string"

  • . means "any character" so to mean "an actual dot" you have to type "\\." in a Swift string.
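As a side note on the doubled backslashes: in Swift 5 and later (newer than this answer), a raw string literal avoids the doubling. This is only a sketch to show that the two spellings produce the same pattern; the `"a ,b"` test string is made up.

```swift
import Foundation

// Ordinary string literal: every regex backslash is doubled.
let escaped = "\\s*(\\.\\.\\.|\\.|,)\\s*"

// Raw string literal (Swift 5 and later): backslashes pass through as written.
let raw = #"\s*(\.\.\.|\.|,)\s*"#

print(escaped == raw) // both literals contain the same pattern text

// Either spelling can be handed to the regex engine.
let fixed = "a ,b".replacingOccurrences(of: raw, with: "$1 ", options: .regularExpression)
print(fixed)
```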

Notice that I check for both "..." and "." in the same regular expression. You kind of have to do something like that, or else the "." will match three times inside the "...". Another approach would be to first replace "..." with "…" (the single ellipsis character, typed on a Mac by pressing Opt-;). Then "…" is a one-character punctuation mark. (You could also decide to re-expand all ellipses back to dot-dot-dot at the end of the process.)
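That ellipsis-first variant could look roughly like this. It is a sketch only: the character class `[.,…]` and the order of the steps are assumptions, not code from the answer above.

```swift
import Foundation

let input = "Hello ,  World ... I 'm a newbie iOS Developer."

// 1. Fold "..." into the single ellipsis character so it behaves as one unit.
var text = input.replacingOccurrences(of: "...", with: "…")

// 2. Period, comma, or ellipsis: no space before, exactly one space after.
text = text.replacingOccurrences(of: "\\s*([.,…])\\s*",
                                 with: "$1 ",
                                 options: .regularExpression)

// 3. Apostrophe: no space on either side.
text = text.replacingOccurrences(of: "\\s*'\\s*",
                                 with: "'",
                                 options: .regularExpression)

// 4. Trim the ends, then expand the ellipsis back to three dots.
text = text.trimmingCharacters(in: .whitespaces)
text = text.replacingOccurrences(of: "…", with: "...")
print(text)
```

With the ellipsis folded into one character, the alternation in the pattern is no longer needed and a plain character class does the job.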

Something like this is probably how I'd do it in real life to get it done and shipped, but it may be worth the pain/practice to try to build this as a character-by-character state machine, walking one character at a time and keeping track of your current state.
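For what it's worth, here is one possible shape for such a state machine. Everything in it is an assumption made for illustration: the `State` cases, the `normalize` function, and the two rule sets (period and comma get one trailing space, apostrophe gets none). It only treats the plain space character as whitespace and does not handle cases such as decimal numbers ("3.14").

```swift
// Assumed rule set: no space before, one space after.
let spaceAfter: Set<Character> = [".", ","]
// Assumed rule set: no space on either side.
let noSpace: Set<Character> = ["'"]

enum State {
    case start         // nothing emitted yet
    case word          // last emitted character was ordinary text
    case pendingSpace  // saw a space after text; not emitted yet
    case afterPunct    // last emitted character wants one space after it
    case afterTight    // last emitted character wants no space after it
}

func normalize(_ input: String) -> String {
    var output = ""
    var state = State.start
    for ch in input {
        if ch == " " {
            // Remember a space only after ordinary text; drop it everywhere else.
            if state == .word { state = .pendingSpace }
        } else if spaceAfter.contains(ch) {
            output.append(ch) // any pending space is discarded
            state = .afterPunct
        } else if noSpace.contains(ch) {
            output.append(ch) // any pending space is discarded
            state = .afterTight
        } else {
            // Ordinary character: emit the separator the previous state calls for.
            if state == .pendingSpace || state == .afterPunct {
                output.append(" ")
            }
            output.append(ch)
            state = .word
        }
    }
    return output
}

print(normalize("Hello ,  World ... I 'm a newbie iOS Developer."))
```

Because spaces are only ever emitted lazily, right before the next ordinary character, leading and trailing spaces disappear without a separate trimming step.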

Farzad Karimi
Rob Napier
  • You can use `\p{Po}` in your regular expression to cover the "Other Punctuation" class, of which `.` and `'` are both members. This way you're not hard-coding the punctuation. – Lily Ballard Oct 02 '17 at 21:02
  • @KevinBallard That won't work here because `.` and `'` have different rules. `.` requires a trailing space, but `'` (at least this use of `'`) requires there be no trailing space. As this algorithm grows to handle the many corner cases, "other punctuation" is going to be even more inflexible. I believe you'll find you need to hard code much *more* of the punctuation to handle those complexities. – Rob Napier Oct 04 '17 at 09:58
  • In the sample input, there is no instance of a `'` with a trailing space. It's unclear to me if that's something that needs to be handled. – Lily Ballard Oct 04 '17 at 20:06
2

You can try something like string.replacingOccurrences(of: " ,", with: ",") for every punctuation mark.
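Spelled out as a loop, that idea might look like the sketch below. The list of marks is an assumption, and the approach only removes a single space directly in front of each mark, so doubled spaces or spaces after punctuation would need extra passes.

```swift
import Foundation

let input = "Hello , World ... I 'm a newbie iOS Developer."

var result = input
// One literal replacement per punctuation mark (assumed list),
// each removing a single space directly in front of the mark.
for mark in [",", ".", "'"] {
    result = result.replacingOccurrences(of: " " + mark, with: mark)
}
print(result)
```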

zvone
1

Simplest answer (to my knowledge) to remove all spaces in a string:

originalString.filter { $0 != " " }

Strings being collections of Character values (not arrays, strictly speaking), we can use Collection methods such as filter on them.
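Applied to the sentence from the question, a sketch of this looks as follows. The `String(...)` wrapper is an assumption added so the snippet yields a String regardless of what `filter` returns in a given Swift version.

```swift
let originalString = "Hello , World ... I 'm a newbie iOS Developer."

// Drops every space character, including the ones between words.
let noSpaces = String(originalString.filter { $0 != " " })
print(noSpaces)
```

Note that this removes the spaces between words as well, so the result is a single run of characters.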

Jean Le Moignan
0

Interesting problem; here's my stab at a non-Regex approach:

import Foundation // needed for components(separatedBy:) and trimmingCharacters(in:)

func correct(input: String) -> String {
    typealias Correction = (punctuation: String, replacement: String)

    let corrections: [Correction] = [
        (punctuation: "...", replacement: "... "),
        (punctuation: "'", replacement: "'"),
        (punctuation: ",", replacement: ", "),
    ]

    var transformed = input
    for correction in corrections {
        transformed = transformed
            .components(separatedBy: correction.punctuation)
            .map({ $0.trimmingCharacters(in: .whitespaces) })
            .joined(separator: correction.replacement)
    }

    return transformed
}

let testInput = "Hello , World ... I 'm a newbie iOS Developer."
let testOutput = correct(input: testInput)

// Hello, World... I'm a newbie iOS Developer.
johnpatrickmorgan
0

If you were doing this manually by processing character arrays, you would merely need to check the previous and next characters around each space. You can achieve the same result using functional-style programming with zip, filter and map:

let testInput = "Hello , World ... I 'm a newbie iOS Developer."

let punctuation    = Set(".\',")
let previousNext   = zip( [" "] + testInput, String(testInput.dropFirst()) + [" "] )
let filteredChars  = zip(Array(previousNext),testInput)
                    .filter{  $1 != " "
                              || !($0.0 != " " && punctuation.contains($0.1))
                           }
let filteredInput = String(filteredChars.map{$1})

print(testInput)     // Hello , World ... I 'm a newbie iOS Developer.
print(filteredInput) // Hello, World... I'm a newbie iOS Developer.
Alain T.
-4

Swift 4, 4.2 and 5

let str = "  Akbar Code  "
let trimmedString = str.trimmingCharacters(in: .whitespaces)
Akbar Khan