
I am trying to match balanced braces ({}) in strings. For example, I want to match the brace blocks in the following:

if (a == 2)
{
  doSomething();
  { 
     int x = 10;
  }
}

// this is a comment

while (a <= b){
  print(a++);
} 

I adapted this regex from MSDN, but it doesn't work well. I want to extract multiple sets of nested, matching {}. I am only interested in the outermost (parent) match of each set.

   string pattern =
   "[^{}]*" +
   "(" + 
   "((?'Open'{)[^{}]*)+" +
   "((?'Close-Open'})[^{}]*)+" +
   ")*" +
   "(?(Open)(?!))";
John Saunders
  • It's impossible for a standard regex. See: http://stackoverflow.com/questions/133601/can-regular-expressions-be-used-to-match-nested-patterns – tzaman Feb 06 '12 at 04:18
  • But C#/.NET supports balancing groups, so it should be possible? –  Feb 06 '12 at 04:39

1 Answer


You're pretty close.

Adapted from the second answer on this question (I use that as my canonical "balancing xxx in C#/.NET regex engine" answer, upvote it if it helped you! It has helped me in the past):

var r = new Regex(@"
[^{}]*                  # any non brace stuff.
\{(                     # First '{' + capturing bracket
    (?:                 
    [^{}]               # Match all non-braces
    |
    (?<open> \{ )       # Match '{', and capture into 'open'
    |
    (?<-open> \} )      # Match '}', and delete the 'open' capture
    )+                  # Change to * if you want to allow {}
    (?(open)(?!))       # Fails if 'open' stack isn't empty!
)\}                     # Last '}' + close capturing bracket
", RegexOptions.IgnorePatternWhitespace);
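As a rough usage sketch (not part of the original answer), the pattern can be run with `Regex.Matches` to collect each outermost block. Two things here are my own assumptions: the leading `[^{}]*` is dropped so that each match starts at its opening brace instead of including the unrelated text before it, and `+` is changed to `*` so that an empty `{}` also counts. The class and method names (`BraceBlocks`, `Extract`) are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class BraceBlocks
{
    // Same idea as the pattern above, minus the leading [^{}]*,
    // so each match begins at the '{' of an outermost block.
    private static readonly Regex Outer = new Regex(@"
        \{(                     # opening '{' of an outermost block
            (?:
            [^{}]               # any non-brace character
            |
            (?<open> \{ )       # nested '{': push onto the 'open' stack
            |
            (?<-open> \} )      # nested '}': pop from the 'open' stack
            )*                  # '*' so that an empty {} also matches
            (?(open)(?!))       # fail if a nested '{' is still unclosed
        )\}                     # closing '}' of the outermost block
        ", RegexOptions.IgnorePatternWhitespace);

    // Returns the text of every outermost {...} block, braces included.
    public static List<string> Extract(string input)
    {
        var blocks = new List<string>();
        foreach (Match m in Outer.Matches(input))
        {
            blocks.Add(m.Value);
        }
        return blocks;
    }
}
```

Called on the sample from the question, `BraceBlocks.Extract(input)` should yield two entries: the whole `if` body (with its nested block inside) and the `while` body. For `"abc{def{123}}"` it yields the single entry `{def{123}}`.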
mathematical.coffee
  • Nice. Thanks. I just have a minor problem I am trying to fix. It's reading a lot of unrelated lines before the first brace { –  Feb 06 '12 at 04:58
  • Is this a way to use this -- var input = "abc{def{123}}"; Match m = Regex.Matches(input, pattern); – sgtz Oct 10 '12 at 05:39
  • [^{}]*\{((?:[^{}]|(?<open>\{)|(?<-open>\}))+(?(open)(?!)))\} – liuhongbo Apr 13 '13 at 04:00
  • I don't believe this works. It fails on a pretty simple case. return Regex.IsMatch("{{}}", @"[^{}]*\{((?:[^{}]|(?<open>\{)|(?<-open>\}))+(?(open)(?!)))\}"); // returns True return Regex.IsMatch("{{{}{}}", @"[^{}]*\{((?:[^{}]|(?<open>\{)|(?<-open>\}))+(?(open)(?!)))\}"); // returns True, Expected False – JJS Apr 16 '13 at 19:25
  • @JJS it returns true on the last one because it matches the balanced part `{{}{}}`, not the entire string. If you want to force it against the entire string, anchor the pattern with `^` and `$` – mathematical.coffee Apr 16 '13 at 23:56
  • Thanks for the tip @mathematical.coffee. Regex.IsMatch("{{{}{}}", @"^[^{}]*\{((?:[^{}]|(?<open>\{)|(?<-open>\}))+(?(open)(?!)))\}$") does in fact return False, as I expected. – JJS Apr 17 '13 at 02:47
  • Is there a way to use the same balancing for all types of braces (e.g. parentheses, curly braces and square brackets)? – Hiral Desai Oct 11 '15 at 22:37
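The anchoring point from the comments above can be checked with a small sketch. It uses the single-line form of the pattern quoted in the comments. The class and method names (`BraceCheck`, `ContainsBalanced`, `IsEntirelyBalanced`) are hypothetical, chosen only for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BraceCheck
{
    // Single-line form of the answer's pattern, as quoted in the comments.
    private const string Balanced =
        @"[^{}]*\{((?:[^{}]|(?<open>\{)|(?<-open>\}))+(?(open)(?!)))\}";

    // True if some substring of the input is a balanced {...} block.
    public static bool ContainsBalanced(string s)
    {
        return Regex.IsMatch(s, Balanced);
    }

    // True only if the pattern accounts for the entire input,
    // because ^ and $ pin the match to both ends of the string.
    public static bool IsEntirelyBalanced(string s)
    {
        return Regex.IsMatch(s, "^" + Balanced + "$");
    }
}
```

With this sketch, `ContainsBalanced("{{{}{}}")` is true because the substring `{{}{}}` is balanced, while `IsEntirelyBalanced("{{{}{}}")` is false. Because the pattern uses `+`, a bare `{}` is rejected by both methods unless `+` is changed to `*`.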