Is it possible to configure the Swift RegexBuilder DSL Capture .url(…) to precisely capture the URL within standard Markdown link syntax? If yes, then how?

Minimal Pattern Attempt

import Foundation
import RegexBuilder

let inputMD = "[Markdown link text](https://example.com)"

let regexMD = Regex {
    Capture { .url() }
}
let matchMD = inputMD.firstMatch(of: regexMD)
print("matchMD →", matchMD?.output ?? "nil")
// matchMD → ("https://example.com)", https://example.com))

The Minimal Pattern Attempt fails because the captured result, https://example.com), includes the trailing ) of the Markdown link syntax.

Closing Paren Pattern Attempt

The Markdown closing paren ) is added after the Capture in regexMD_2, and both the opening ( and closing ) are added around it in regexMD_3:

let regexMD_2 = Regex {
    Capture { .url() }
    ")" // :ADDED:
}
let matchMD_2 = inputMD.firstMatch(of: regexMD_2)
print("matchMD_2 →", matchMD_2?.output ?? "nil")
// matchMD_2 → nil

let regexMD_3 = Regex {
    "(" // :ADDED:
    Capture { .url() }
    ")" // :ADDED:
}
let matchMD_3 = inputMD.firstMatch(of: regexMD_3)
print("matchMD_3 →", matchMD_3?.output ?? "nil")
// matchMD_3 → nil

The Closing Paren Pattern Attempt fails: both variants return nil.

Observation:

Here is another Markdown test case, in which the URL itself contains parentheses:

[Bracket_(architecture) ⇗](https://en.wikipedia.org/wiki/Bracket_(architecture))

The Markdown link with multiple () pairs previews and links correctly in markdown-preview-enhanced. However, the Stack Overflow syntax highlighter does not render it quite accurately, and the basic .url() used as-is does not capture it successfully.

Note: I have other regex approaches which successfully capture the URL from a Markdown link. This question is specific to the use of .url() within the RegexBuilder Capture construct.

marc-medley
The `.url()` regex allows parentheses in the host name, so it's matching on `example.com)`. It will also match on e.g. `example.c)om`. This is not technically illegal according to the URI RFC, so it's not a bug as such. – JeremyP Jan 16 '23 at 14:34
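
If `.url()` accepts `)` as part of the URL, as the comment above suggests, one possible way to stay within RegexBuilder is to delimit the link destination structurally and only then convert it to a `URL`. The sketch below is a workaround, not a configuration of `.url()`: it swaps `.url()` for `TryCapture` with `URL(string:)`. The helper name `firstMarkdownLinkURL(in:)` is hypothetical, and the pattern assumes an inline link whose destination contains no whitespace and at most one level of nested parentheses.

```swift
import Foundation
import RegexBuilder

// Hypothetical helper (not part of Foundation or RegexBuilder).
// Returns the destination of the first inline Markdown link, if any.
// Assumption: the destination has no whitespace and at most one level
// of nested parentheses, e.g. .../Bracket_(architecture)
func firstMarkdownLinkURL(in markdown: String) -> URL? {
    // Any character that is neither a parenthesis nor whitespace.
    let plain = CharacterClass.anyOf("()").union(.whitespace).inverted

    // One balanced "(...)" group inside the destination.
    let nestedGroup = Regex {
        "("
        ZeroOrMore(plain)
        ")"
    }

    let linkRegex = Regex {
        "["
        OneOrMore(CharacterClass.anyOf("]").inverted) // link text
        "]("
        TryCapture {
            OneOrMore {
                ChoiceOf {
                    plain
                    nestedGroup
                }
            }
        } transform: { candidate in
            // Returning nil here rejects the match instead of capturing it.
            URL(string: String(candidate))
        }
        ")" // closing paren of the Markdown link itself
    }

    return markdown.firstMatch(of: linkRegex)?.output.1
}

let simple = "[Markdown link text](https://example.com)"
let nested = "[Bracket_(architecture) ⇗](https://en.wikipedia.org/wiki/Bracket_(architecture))"

print(firstMarkdownLinkURL(in: simple)?.absoluteString ?? "nil")
print(firstMarkdownLinkURL(in: nested)?.absoluteString ?? "nil")
```

Because the closing `)` of the link is matched outside the capture, and balanced groups are consumed inside it, the capture should stop before the Markdown delimiter in both test cases. Deeper nesting or quoted link titles would need a more complete pattern.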
