I'm having trouble with a regex in Swift. I've looked around but can't get it to work. I've copied the matches(for:in:) method from "Swift extract regex matches" into my code.

I have text in my test String that reads "SOURCEKEY:B" and I want to extract the "B". So I pass "SOURCEKEY:([A-Z])" into matches(for:in:) but the result is the full string "SOURCEKEY:B". What am I doing wrong?

My code, by the way (although I think all you need to know is the expression I'm trying)

func testRegEx() {
    let text = getTextFor("Roll To Me")!
    XCTAssertTrue(text.contains("Look around your world"))  // passes
    XCTAssertTrue(text.contains("SOURCEKEY:")) // passes
    let expression = "SOURCEKEY:([A-Z])(?s.)DESTKEY:([A-Z])(?s.)"
    let matchesArray = matches(for: expression, in: text) // matchesArray[0] = "SOURCEKEY:"
}

That's the first part. The ultimate expression I want will break up the text like this (all the text I want returned is backticked below):

SOURCEKEY:B

a bunch of text
more lines of text
these go in the 2nd returned value, where "B" is the first returned value
everything up to...

DESTKEY:E

a bunch more text
these go in the 4th returned value, where "E" is the third returned value
this includes the remainder of the string after that 3rd value

I've managed to successfully do this without regex, to get sourceKey, origText, destKey, and expectedText for the 4 elements referenced above:

    let allComponents = text.components(separatedBy: "KEY:")
    let origTextComponents = allComponents[1].split(separator: "\n", maxSplits: 1, omittingEmptySubsequences: false).map{String($0)}
    let sourceKey = origTextComponents[0]
    let origText = origTextComponents[1].replacingOccurrences(of: "DEST", with: "")
    let destTextComponents = allComponents[2].split(separator: "\n", maxSplits: 1, omittingEmptySubsequences: false).map{String($0)}
    let destKey = destTextComponents[0]
    let expectedText = destTextComponents[1]

But I imagine the correct regex would cut this down to one line, whose elements I could access to initialize a struct in my next line.

Jonathan Tuzman
  • Show us your code. We have nothing to compile, nothing to run. I don't feel like reproducing your code just to tinker with it. – Alexander Jul 24 '18 at 03:04
  • Use `"(?<=SOURCEKEY:)[A-Z]+"` – Wiktor Stribiżew Jul 24 '18 at 07:18
  • @WiktorStribiżew that works, as a start, thank you! It gives me the character after the colon. Looks like the parentheses mean "don't capture" instead of "capture". I want to capture additional strings, I'll make it clearer in my post what I'm looking for, and then maybe you can help me figure out the expression to get all of them? – Jonathan Tuzman Jul 24 '18 at 14:27
  • What is content of `text`? – Code Different Jul 24 '18 at 14:28
  • *What am I doing wrong?* The `matches(for:in:)` method in the linked question does not consider captured groups. – vadian Jul 24 '18 at 14:34
  • `text` is a long string that is known to include `SOURCEKEY:` and `DESTKEY:` (where each is followed by the key I want to extract, and then a long string of other info I want to extract). @vadian then what is it for? Or do I have a fundamental misunderstanding of what capturing means... – Jonathan Tuzman Jul 24 '18 at 14:48
  • In a match result, the `range` of `NSTextCheckingResult` represents the entire match, and `range(at:)` with indices starting at 1 returns the ranges of the captured groups `()` from left to right. The `matches(for:in:)` method only uses `range`. – vadian Jul 24 '18 at 14:54
  • Thanks @vadian. I replaced the return statement with ` String(text[Range($0.range(at: results.index(of: $0)!), in: text)!])` which appears to work, but currently returns the same as the original line, because I don't seem to know how to construct a regex that returns multiple captured matches. Ex: the expression `"(?<=SOURCEKEY:).+"` for `"SOURCEKEY:A B C"` returns `A B C` – Jonathan Tuzman Jul 24 '18 at 15:46
  • I wrote an answer. – vadian Jul 24 '18 at 16:23
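
The distinction drawn in the comments above, `range` for the whole match versus `range(at:)` for a captured group, can be seen on the one-line input from the question. This is only a minimal sketch: the helper name `firstMatchParts(of:in:)` is hypothetical and is not the `matches(for:in:)` method from the linked question.

```swift
import Foundation

// Hypothetical helper: returns the whole match and the first captured group
// of the first match, or nil if there is no match or no group 1.
func firstMatchParts(of pattern: String, in text: String) -> (whole: String, group1: String)? {
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return nil }
    let fullRange = NSRange(text.startIndex..<text.endIndex, in: text)
    guard let match = regex.firstMatch(in: text, options: [], range: fullRange),
          match.numberOfRanges > 1,
          // range (equivalent to range(at: 0)) covers the entire match.
          let wholeRange = Range(match.range, in: text),
          // range(at: 1) covers only the first parenthesized group.
          let groupRange = Range(match.range(at: 1), in: text)
    else { return nil }
    return (String(text[wholeRange]), String(text[groupRange]))
}

if let parts = firstMatchParts(of: "SOURCEKEY:([A-Z])", in: "SOURCEKEY:B") {
    print(parts.whole)  // SOURCEKEY:B
    print(parts.group1) // B
}
```

So the parentheses in the original pattern do capture; a helper that only reads `range` simply never looks at the captured groups.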

1 Answer

This is an example of how to get the captured groups of a regular expression. The groups at index 3 and 6 are the inner (.|\\n) expressions used to search across line boundaries, so the four values you want are at indices 1, 2, 4 and 5.

import Foundation

let string = """
SOURCEKEY:B

a bunch of text
more lines of text
these go in the 2nd returned value, where "B" is the first returned value
everything up to...
DESTKEY:E

a bunch more text
these go in the 4th returned value, where "E" is the third returned value
this includes the remainder of the string after that 3rd value

"""

let pattern = "SOURCEKEY:([A-Z])\\s+((.|\\n)*)DESTKEY:([A-Z])\\s+((.|\\n)*)"

do {
    let regex = try NSRegularExpression(pattern: pattern)
    if let match = regex.firstMatch(in: string, range: NSRange(string.startIndex..<string.endIndex, in: string)) {
        print(string[Range(match.range, in: string)!]) // the entire match, ignoring the captured groups
        print(string[Range(match.range(at: 1), in: string)!]) // source key ("B")
        print(string[Range(match.range(at: 2), in: string)!]) // text between the two keys
        print(string[Range(match.range(at: 4), in: string)!]) // destination key ("E")
        print(string[Range(match.range(at: 5), in: string)!]) // text after the destination key
    } else {
        print("Not Found")
    }
} catch {
    print("Regex Error:", error)
}
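
A possible variation, sketched here under a few assumptions: the `.dotMatchesLineSeparators` option lets a plain `.` cross line boundaries, which removes the extra (.|\\n) groups, so the four values land at indices 1 to 4. The helper name `captureGroups(of:in:)`, the lazy `.*?` and the shortened sample text are illustrative choices, not part of the code above.

```swift
import Foundation

// Hypothetical helper: returns the captured groups (index 1 and up) of the
// first match, or an empty array if the pattern is invalid or does not match.
func captureGroups(of pattern: String, in text: String) -> [String] {
    // With .dotMatchesLineSeparators, "." also matches newline characters.
    guard let regex = try? NSRegularExpression(pattern: pattern,
                                               options: [.dotMatchesLineSeparators]) else {
        return []
    }
    let fullRange = NSRange(text.startIndex..<text.endIndex, in: text)
    guard let match = regex.firstMatch(in: text, options: [], range: fullRange) else {
        return []
    }
    // Index 0 is the entire match, so start at 1.
    return (1..<match.numberOfRanges).map { index -> String in
        // A group that did not participate in the match has no valid range.
        guard let range = Range(match.range(at: index), in: text) else { return "" }
        return String(text[range])
    }
}

// Shortened sample input, assumed for illustration only.
let sample = "SOURCEKEY:B\n\nfirst line\nsecond line\nDESTKEY:E\n\nthird line\n"
let groups = captureGroups(of: "SOURCEKEY:([A-Z])\\s+(.*?)DESTKEY:([A-Z])\\s+(.*)", in: sample)
print(groups) // four elements: source key, original text, destination key, expected text
```

The four elements could then be passed straight to a struct initializer, as the question intends. Note that the second and fourth elements keep any trailing newline from the input.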
vadian