I am trying to extract placeholders, together with their parameters, from text (some of it raw text, some of it XML).
The placeholders are delimited by one of the following: $, %, [], {}
The examples below use $ and show the different kinds of content I'm interested in.
$EX1$ -> EX1
$EX2(a$b$c)$ -> EX2, (, a$b$c
$EX3(abc\x/)$ -> EX3, (, abc\x/
$EX4(\@\,/&/)$ -> EX4, (, \@\,/&/
$EX5/X(Z)Y/$ -> EX5, /, X(Z)Y
$EX6/X(ABC)/1$ -> EX6, /, X(ABC), 1
$EX7/X\\Z\/Y/$ -> EX7, /, X\\Z\/Y
$EX8/(A)/(B)/$ -> EX8, /, (A), (B)
$EX9/(\\$A$)\//(\\$B$\/)/$ -> EX9, /, (\\$A$)\/, (\\$B$\/)
The first part is the placeholder name, optionally followed by parameters in one of these forms: (...), /.../, /.../xx or /.../.../, where xx is a number and ... can be anything.
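To make the slash-delimited forms concrete, here is a minimal sketch of how the parts between slashes are meant to separate. It assumes a backslash escapes the character that follows it (an assumption that is consistent with EX7 and EX9). The helper name split_unescaped is hypothetical, and Python is used only for illustration, not as the target environment.

```python
def split_unescaped(body, sep="/", esc="\\"):
    """Split body on sep, treating esc plus the next character as a literal pair."""
    parts, current, i = [], [], 0
    while i < len(body):
        ch = body[i]
        if ch == esc and i + 1 < len(body):
            # Keep the escape pair verbatim so the raw text is preserved.
            current.append(body[i:i + 2])
            i += 2
        elif ch == sep:
            # An unescaped separator closes the current part.
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    parts.append("".join(current))
    return parts


# Input is the text after the opening slash (and before the final slash, if any).
for body in [r"X(ABC)/1", r"X\\Z\/Y", r"(A)/(B)", r"(\\$A$)\//(\\$B$\/)"]:
    print(body, "->", split_unescaped(body))
```

The escape pairs are kept as-is in the output, since the expected results in the examples above also keep the backslashes.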
I've built the following regex, which almost does the job. Is there a way to improve it, or is there another approach altogether? It must be compatible with the .NET regex engine.
\$
(?=[^$]{3,100}\$)
(?<PH>[A-Za-z0-9:_-]{1,20})
(?:
    (?<C1>\/)
    (?<RX>(?:[^\\\/\r\n]|\\\/?)*)
    \/
    (?:
        (?<R>(?:[^\\\/\r\n$]|\\[\/$]?)*)
        \/
    |
        (?<G>\d*)
    )
|
    (?:
        (?<C2>\()
        (?<F>(?:[^\t\r\n\f()]|\\[()]?)*)
        \)
    )?
)
\$
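A minimal sketch for checking the pattern against the examples above. It is a Python port, used only as a convenient stand-in for the .NET engine (an assumption, not the target environment): the named groups are written (?P<name>...) instead of (?<name>...), and re.VERBOSE plays the role of a whitespace-ignoring mode, which the multi-line layout of the pattern seems to assume. The parse helper and the SAMPLES list are hypothetical names introduced for illustration.

```python
import re

# Python port of the pattern above; only the named-group syntax differs from .NET.
PATTERN = re.compile(r"""
\$
(?=[^$]{3,100}\$)
(?P<PH>[A-Za-z0-9:_-]{1,20})
(?:
    (?P<C1>\/)
    (?P<RX>(?:[^\\\/\r\n]|\\\/?)*)
    \/
    (?:
        (?P<R>(?:[^\\\/\r\n$]|\\[\/$]?)*)
        \/
    |
        (?P<G>\d*)
    )
|
    (?:
        (?P<C2>\()
        (?P<F>(?:[^\t\r\n\f()]|\\[()]?)*)
        \)
    )?
)
\$
""", re.VERBOSE)

# The nine examples from the question, as raw strings.
SAMPLES = [
    r"$EX1$",
    r"$EX2(a$b$c)$",
    r"$EX3(abc\x/)$",
    r"$EX4(\@\,/&/)$",
    r"$EX5/X(Z)Y/$",
    r"$EX6/X(ABC)/1$",
    r"$EX7/X\\Z\/Y/$",
    r"$EX8/(A)/(B)/$",
    r"$EX9/(\\$A$)\//(\\$B$\/)/$",
]


def parse(text):
    """Return the named groups that took part in a full match, or None."""
    m = PATTERN.fullmatch(text)
    if m is None:
        return None
    return {name: value for name, value in m.groupdict().items() if value is not None}


for sample in SAMPLES:
    print(sample, "->", parse(sample))
```

In this port, EX1 through EX8 produce a match and EX9 returns None. The cause appears to be that the R group excludes an unescaped $, so it cannot capture (\\$B$\/).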