
Updated

I need help with a regex that will find the "@[..[...]]" pattern.

I will try to explain.

The text contains placeholders that are replaced with values when the text is displayed.

A placeholder has three parts:

  • an open tag, which starts with "@[", is followed by a dot-delimited name, and ends with "[",
  • a property list, a comma-separated list of values in double quotes,
  • a close tag, "]]".

The property list items can contain one or more nested placeholders, as well as double quotes (escaped) and brackets.

The regex must handle nested placeholders by recognizing where the outer placeholder ends, and must cope with any escaped quotes and brackets inside the values.

Sample

Consider the following text fragment:

Linklist    
@[Link.AppText["[startpage]", "startpage"]]
@[Link.Text["[startpage] loggedin", "The \"@[Text.AppText["startpage"]]\" for users"]]
@[Link.Text["@[Link["startpage"]]", "@[Text.AppText["startpage"]]"]]

The match should look like this:

match 1  =  @[Link.AppText["[startpage]", "startpage"]]
   Gr.1  =  Link.AppText
   Gr.2  =  "[startpage]", "startpage"

match 2  =  @[Link.Text["[startpage] loggedin", "The \"@[Text.AppText["startpage"]]\" for users"]]
   Gr.1  =  Link.Text
   Gr.2  =  "[startpage] loggedin", "The \"@[Text.AppText["startpage"]]\" for users"

match 3  =  @[Link.Text["@[Link["startpage"]]", "@[Text.AppText["startpage"]]"]]
   Gr.1  =  Link.Text
   Gr.2  =  "@[Link["startpage"]]", "@[Text.AppText["startpage"]]"

With a solution by @ridgerunner I solved it:

@\[([._\w]+)\[([^[\]""]*(?:""[^""\\]*(?:\\.[^""\\]*)*""[^[\]""]*)*)\]\]

@\[                                # Outer open delimiter.
([._\w]+)                          # Group 1: dot-delimited name.
\[                                 # Inner open delimiter.
(                                  # Start of group 2.
[^[\]""]*                          # Unquoted content (no brackets or quotes).
(?:""[^""\\]*(?:\\.[^""\\]*)*""    # Quoted value, escaped chars allowed.
[^[\]""]*)*                        # More unquoted content, then repeat.
)                                  # End of group 2.
\]\]                               # Close delimiter.
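
For anyone who wants to try the pattern above, here is a minimal sketch that runs it over the sample text from the question. The class name `PlaceholderDemo` and the helper `Find` are made up for illustration; the pattern is the one shown above, written as a C# verbatim string (where `""` means one literal quote) and used with `RegexOptions.IgnorePatternWhitespace`.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PlaceholderDemo
{
    // Pattern from the question; comments inside the string are regex comments.
    public const string Pattern = @"
        @\[                                # Outer open delimiter.
        ([._\w]+)                          # Group 1: dot-delimited name.
        \[                                 # Inner open delimiter.
        (                                  # Group 2: property list.
          [^[\]""]*                        #   Unquoted content.
          (?:
            ""[^""\\]*(?:\\.[^""\\]*)*""   #   Quoted value, escapes allowed.
            [^[\]""]*                      #   More unquoted content.
          )*
        )
        \]\]                               # Close delimiter.";

    // Sample text from the question, one placeholder per line.
    public static readonly string Sample = string.Join("\n",
        @"Linklist",
        @"@[Link.AppText[""[startpage]"", ""startpage""]]",
        @"@[Link.Text[""[startpage] loggedin"", ""The \""@[Text.AppText[""startpage""]]\"" for users""]]",
        @"@[Link.Text[""@[Link[""startpage""]]"", ""@[Text.AppText[""startpage""]]""]]");

    // Hypothetical helper: returns every outer placeholder found in the text.
    public static MatchCollection Find(string text) =>
        Regex.Matches(text, Pattern, RegexOptions.IgnorePatternWhitespace);

    public static void Main()
    {
        foreach (Match m in Find(Sample))
        {
            Console.WriteLine("match = " + m.Value);
            Console.WriteLine(" Gr.1 = " + m.Groups[1].Value);
            Console.WriteLine(" Gr.2 = " + m.Groups[2].Value);
        }
    }
}
```

On the sample this should give the three matches listed earlier. Note that the pattern does not track nesting as such: it gets through a nested placeholder because the quotes around and inside it happen to pair up into quoted runs, and brackets are allowed inside quoted runs.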

And, for anyone looking for a "balanced group" solution:

After struggling with Google searches and a lot of regex testing, I finally figured out another working solution, though I had to alter the placeholder pattern slightly to make it work (at least for me :)):

Regex:  @([._\w]+)\[\[(""(?:[^\[\]]*|\[[^\[]|[^\]]\]|(?<counter>\[\[)|(?<-counter>\]\]))+(?(counter)(?!))"")\]\]

@([._\w]+)\[\[            #   start tag, group 1
  (""                     #   start of group 2
    (?:                   #   non-capturing group
      [^\[\]]*            #   any chars except [ or ]
      |                   #   or
      \[[^\[]             #   a [ not followed by a [
      |                   #   or
      [^\]]\]             #   a ] not preceded by a ]
      |                   #   or
      (?<counter>\[\[)    #   nested start tag: push onto counter
      |                   #   or
      (?<-counter>\]\])   #   nested end tag: pop from counter
    )+                    #   end of non-capturing group
    (?(counter)(?!))      #   if counter <> 0, the regex fails
  "")                     #   end of group 2
\]\]                      #   end tag

Updated placeholders using the new pattern (@..[[...]]):

Linklist
@Link.AppText[["[startpage]", "startpage"]]
@Link.Text[["[startpage] loggedin", "The \"@Text.AppText[["startpage"]]\" for users"]]
@Link.Text[["@Link[["startpage"]]", "@Text.AppText[["startpage"]]"]]
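
A minimal sketch of how the balanced-group pattern could be run against the updated placeholders. The class name `BalancedPlaceholderDemo` and the helper `Find` are hypothetical; the pattern is copied from above as a C# verbatim string. Balancing groups are a .NET feature, so this is not expected to carry over to other regex engines unchanged.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BalancedPlaceholderDemo
{
    // (?<counter>...) pushes a capture for each nested [[,
    // (?<-counter>...) pops one for each nested ]],
    // and (?(counter)(?!)) fails the match if anything is left unpopped.
    public const string Pattern =
        @"@([._\w]+)\[\[(""(?:[^\[\]]*|\[[^\[]|[^\]]\]|(?<counter>\[\[)|(?<-counter>\]\]))+(?(counter)(?!))"")\]\]";

    // Updated sample text from the question (@..[[...]] form).
    public static readonly string Sample = string.Join("\n",
        @"Linklist",
        @"@Link.AppText[[""[startpage]"", ""startpage""]]",
        @"@Link.Text[[""[startpage] loggedin"", ""The \""@Text.AppText[[""startpage""]]\"" for users""]]",
        @"@Link.Text[[""@Link[[""startpage""]]"", ""@Text.AppText[[""startpage""]]""]]");

    // Hypothetical helper: returns every outer placeholder found in the text.
    public static MatchCollection Find(string text) => Regex.Matches(text, Pattern);

    public static void Main()
    {
        foreach (Match m in Find(Sample))
        {
            // Unnamed groups are numbered before named ones in .NET,
            // so 1 is the name and 2 is the property list.
            Console.WriteLine("match = " + m.Value);
            Console.WriteLine(" Gr.1 = " + m.Groups[1].Value);
            Console.WriteLine(" Gr.2 = " + m.Groups[2].Value);
        }
    }
}
```

Because the counter only reacts to the doubled delimiters `[[` and `]]`, single brackets such as `[startpage]` inside a value pass through without affecting the nesting depth.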
Asons

3 Answers

1

Assuming that the quoted portions won't have any escaped chars, this one will do a pretty good job:

if (Regex.IsMatch(subjectString, @"
    # Match @[...[...]...] pattern outside quotes.
    @\[                                # Outer open delimiter.
    [^[\]]*                            # Link text.
    \[                                 # Inner open delimiter.
    [^[\]""]*(?:""[^""]*""[^[\]""]*)*  # Contents.
    \]\]                               # Close delimiter.
    ", RegexOptions.IgnorePatternWhitespace)) {
    // Successful match
} else {
    // Match attempt failed
} 

Note that if the quoted content does contain escaped chars (e.g. "foo\"bar\"foo", or in .NET double-quote syntax: @"foo""bar""foo"), the pattern can be modified to handle that too.
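
To illustrate the difference escaped characters make, here is a small sketch that compares the quoted-value part of the pattern above, `""[^""]*""`, with an escape-aware variant, `""[^""\\]*(?:\\.[^""\\]*)*""`. The class name `QuotedValueDemo` and the input string are invented for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuotedValueDemo
{
    // Quoted value without escape support: stops at the first quote it sees.
    public const string Simple = @"""[^""]*""";

    // Escape-aware variant: a backslash plus the following char is consumed as a pair.
    public const string EscapeAware = @"""[^""\\]*(?:\\.[^""\\]*)*""";

    public static void Main()
    {
        // The input is: "The \"startpage\" for users"
        string input = @"""The \""startpage\"" for users""";

        Console.WriteLine(Regex.Match(input, Simple).Value);      // "The \"
        Console.WriteLine(Regex.Match(input, EscapeAware).Value); // "The \"startpage\" for users"
    }
}
```

The simple form ends the value at the first escaped quote, while the escape-aware form skips over each backslash pair and ends at the real closing quote.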

ridgerunner
  • Thanks .. I updated my question but need a fix for escaped chars. If you can help me with that it would be great. – Asons Nov 28 '13 at 06:54
  • Could you (or anyone) explain this part of the #Contents regex: `"[^[\]"]*` – Asons Nov 28 '13 at 08:25
  • @PellePenna - `[^[\]""]*` matches non-quoted portions of the content (zero or more non-square brackets, non-quotes). If a quote is encountered, then the `(?:""[^""]*""[^[\]""]*)*` part kicks in and matches the quoted portion (which may contain square brackets), followed by more non-quoted content. – ridgerunner Nov 28 '13 at 16:27
  • Wonderful ... and what do I need to add to be able to have escaped double quotes within quoted content? – Asons Nov 28 '13 at 17:10
  • @PellePenna - Change: `""[^""]*""` to: `""[^""\\]*(?:\\.[^""\\]*)*""`. (This allows escaped _anything_ inside the quotes) – ridgerunner Nov 28 '13 at 18:12
0

What does this do?

 #  @"(?-s)@\[([.\w]+)\[""(.*)""\]\]"

 (?-s)
 @\[
 ( [.\w]+ )
 \["
 ( .* )
 "\]\]
  • As I wrote above, it gives me 1 match with 2 groups, which is not what I'm looking for. – Asons Nov 27 '13 at 16:54
  • Hard to tell what you're looking for. This regex does what you said you wanted. If you want to do nested @'s you have to use balanced groups. Check out the MS web site to find out how. I'd do it for you but I'm a little busy right now. –  Nov 27 '13 at 17:20
  • As the @'s are nested, I need a solution that lets me nest them and also use single double quotes and brackets within the texts ... but of course not combined as in the open/close delimiters – Asons Nov 27 '13 at 17:56
0

This might help. The outer group will contain the Link.AppText part and the inner group will contain the inner section.

@\[(?<outer>[^[]+?)\["(?<Inner>.+)"\]\]
Mat J
  • Thanks for the suggestion, though it gives the same result as @[([._\w]+)["(.*)"]] but with named groups, which is not what I'm looking for. – Asons Nov 27 '13 at 16:56