
Say you have

    let fixed = t.replacingOccurrences(
        of: #"(?m)aa[.\s\S]*?bb"#,
        with: ,
        options: [.regularExpression])

You pass in the text of a file, let's say.

This will of course find all the long strings "aa . . . bb" within the text.

If we do this ...

    let fixed = t.replacingOccurrences(
        of: #"(?m)aa[.\s\S]*?bb"#,
        with: "WOW! $0",
        options: [.regularExpression])

it will replace each of the "aa . . . bb" with "WOW! aa . . . bb".

If we do this ...

    let fixed = t.replacingOccurrences(
        of: #"(?m)aa[.\s\S]*?bb"#,
        with: "$0 $0 $0",
        options: [.regularExpression])

it will replace each of the "aa . . . bb" with three copies of that "aa . . . bb".

What if we want to replace each of the "aa . . . bb" with an UPPERCASED version of "aa . . . bb"?

So conceptually something like this ...

    let fixed = t.replacingOccurrences(
        of: #"(?m)aa[.\s\S]*?bb"#,
        with: "$0".uppercased(),
        options: [.regularExpression])

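Of course, that is not how it works: "$0".uppercased() is evaluated once, before the call, so the replacement template is still just "$0" and the text comes back unchanged.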

Thus, I essentially want to do this:

    let fixed = t.replacingOccurrences(
        of: #"(?m)aa[.\s\S]*?bb"#,
        with: process( ... the actual value of $0 ... ),
        options: [.regularExpression])

Where func process(_ s: String) -> String.

How do you do that? How do you "get at" the matched value for each match (i.e., the $0) and then (say) call a function with it, returning a string to use as the replacement for that match?


Just FTR, specifically I want to remove all whitespace/newlines from each of the "aa . . . bb". (Example: "1 2 3 aa x y z bb 1 2 3 aa p q r bb" becomes "1 2 3 aaxyzbb 1 2 3 aapqrbb".)

Fattie

1 Answer


Use the Swift regex APIs. There are overloads of replacing(_:with:) that take a RegexComponent. You can pass a closure that takes a Match and computes the replacement.

Example for uppercasing the matches:

    let text = "something else aa foo bar baz bb something else"
    let regex = /(?m)aa[\s\S]*?bb/
    let result = text.replacing(regex, with: { match in
        // you can do whatever you want with "match" here,
        // then return a collection of characters
        match.output.uppercased()
    })
    print(result)
    // something else AA FOO BAR BAZ BB something else
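
For the specific case in the question (stripping whitespace and newlines from each match), the same closure-based overload can call a helper. This is a minimal sketch, assuming Swift 5.7 or later. `process` is the hypothetical function named in the question, not a library API, and the pattern is written with the `#/.../#` delimiters so that it should compile even where bare-slash regex literals are not enabled.

```swift
// Hypothetical helper from the question: drops all whitespace and newlines.
func process(_ s: String) -> String {
    s.filter { !$0.isWhitespace }
}

let input = "1 2 3 aa x y z bb 1 2 3 aa p q r bb"
let pattern = #/aa[\s\S]*?bb/#

let cleaned = input.replacing(pattern) { match in
    // With no capture groups, match.output is a Substring.
    process(String(match.output))
}
print(cleaned)
// 1 2 3 aaxyzbb 1 2 3 aapqrbb
```

Only the matched spans are passed to `process`; the text between matches is left as it was.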
Sweeper
  • or simply `text.replacing(regex) { $0.output.uppercased() }` or `text.replacing(regex, with: \.output.localizedUppercase)` – Leo Dabus Jul 28 '23 at 02:48
  • Sweeper - brilliant answer thanks! Didn't have a clue about the existence of `{ match in` Massive bounty en route – Fattie Jul 28 '23 at 11:33
  • @LeoDabus - ah! `.output` is indeed what I was looking for, had no idea about it. Where the hell is the doco for that ?!! – Fattie Jul 28 '23 at 11:35
  • @Fattie Are you talking about being able to pass a keypath to a parameter that expects a function type? [See this post](https://stackoverflow.com/q/63764608/5133585). Otherwise, `output` is documented [here](https://developer.apple.com/documentation/swift/regex/match/output). It just gets the output of the `RegexComponent`. For a regex literal with no captures, it's simply a `Substring`. – Sweeper Jul 28 '23 at 11:39
  • Sweeper - Got it, awesome. Actually I do have (or would like to have) two captures (to be clear, normally I'd use them like $0 $1 ...) I'm afraid I don't understand how to extract the captures from `match` ... if that's possible ? :O – Fattie Jul 28 '23 at 11:48
  • @Fattie When there are captures in a regex literal, `output` would be of a tuple type, e.g. `(Substring, Substring, Substring)`. You can get `output.0`, `output.1` etc for the groups. I would recommend checking out the WWDC videos on this. – Sweeper Jul 28 '23 at 11:51
  • Sweeper - INCREDIBLE! (I was stupidly using `match.0` instead of `match.output.0`) TY so much, golden – Fattie Jul 28 '23 at 12:16
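
The capture access discussed in the comments can be sketched roughly as follows. With capture groups in a regex literal, `match.output` is a tuple: `.0` is the whole match and `.1`, `.2` are the groups. The sample string and the two groups here are made up for illustration, and Swift 5.7 or later is assumed.

```swift
let settings = "key1=alpha; key2=beta"

// Two captures: the name before "=" and the value after it.
let pair = #/(\w+)=(\w+)/#

let swapped = settings.replacing(pair) { match in
    // match.output has type (Substring, Substring, Substring):
    // .0 is the full match, .1 the name, .2 the value.
    "\(match.output.2.uppercased())=\(match.output.1)"
}
print(swapped)
// ALPHA=key1; BETA=key2
```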