
I am trying to figure out how to use C# regular expressions to remove all instances of paired parentheses from a string. The parentheses and all text between them should be removed. The parentheses aren't always on the same line. Also, there might be nested parentheses. An example of the string would be

This is a (string). I would like all of the (parentheses
to be removed). This (is) a string. Nested ((parentheses) should) also
be removed. (Thanks) for your help.

The desired output should be as follows:

This is a . I would like all of the . This  a string. Nested  also
be removed.  for your help.
Matt Brandon

4 Answers


Fortunately, .NET regexes can match nested constructs by keeping a counter with balancing groups (see Balancing Group Definitions):

Regex regexObj = new Regex(
    @"\(              # Match an opening parenthesis.
      (?>             # Then either match (possessively):
       [^()]+         #  any characters except parentheses
      |               # or
       \( (?<Depth>)  #  an opening paren (and increase the parens counter)
      |               # or
       \) (?<-Depth>) #  a closing paren (and decrease the parens counter).
      )*              # Repeat as needed.
     (?(Depth)(?!))   # Assert that the parens counter is at zero.
     \)               # Then match a closing parenthesis.",
    RegexOptions.IgnorePatternWhitespace);

In case anyone is wondering: The "parens counter" may never go below zero (`(?<-Depth>)` will fail otherwise), so even if the parentheses are "balanced" but aren't correctly matched (like `()))((()`), this regex will not be fooled.
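As a minimal usage sketch, the pattern above can be passed to Regex.Replace to delete every balanced group. The class name `BalancedParens` and the method name `Strip` are hypothetical, chosen only for illustration, and the pattern is the same one written on a single line without comments.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BalancedParens
{
    // Same pattern as in the answer, without the whitespace and comments.
    // Depth is pushed on each inner "(" and popped on each inner ")".
    private static readonly Regex Balanced = new Regex(
        @"\((?>[^()]+|\((?<Depth>)|\)(?<-Depth>))*(?(Depth)(?!))\)");

    // Hypothetical helper: removes every balanced parenthesized group,
    // including nested ones and groups that span several lines.
    public static string Strip(string input)
    {
        return Balanced.Replace(input, string.Empty);
    }
}
```

Because `[^()]+` also matches line breaks, groups that span lines are removed without any extra options. Calling `BalancedParens.Strip("Nested ((parentheses) should) also")` should give `"Nested  also"` (two spaces remain), which agrees with the desired output in the question.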

For more information, read Jeffrey Friedl's excellent book "Mastering Regular Expressions" (p. 436)

Tim Pietzcker
  • @MattBrandon - There is an even easier way to do this in .NET: [Balancing Group Definitions](http://msdn.microsoft.com/en-us/library/bs2twtah.aspx#balancing_group_definition). – JDB Jan 19 '13 at 02:39
  • @Cyborgx37: What do you mean by "an even easier way"? I *am* using exactly the technique you linked to (thanks for the link - I've included it in my answer). I just use a different name for the counter (`Depth` instead of `Open`) which is of course irrelevant. – Tim Pietzcker Jan 19 '13 at 07:52
  • Also, I usually don't worry about downvotes, but in this case I would be very interested in learning why this answer has been deemed "not helpful" by someone. – Tim Pietzcker Jan 19 '13 at 07:53

Alternatively, you can repeatedly replace /\([^\)\(]*\)/g with the empty string until no more matches are found.
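In C#, this repeated replacement might look like the sketch below. The names `InnermostFirst` and `Strip` are hypothetical; the loop simply keeps replacing until a pass changes nothing.

```csharp
using System;
using System.Text.RegularExpressions;

public static class InnermostFirst
{
    // Matches one innermost pair: "(", any run of non-parenthesis characters, ")".
    private static readonly Regex Innermost = new Regex(@"\([^()]*\)");

    // Hypothetical helper: strips innermost pairs pass by pass.
    // Each pass can expose a new innermost pair, so repeat until stable.
    public static string Strip(string input)
    {
        string previous;
        do
        {
            previous = input;
            input = Innermost.Replace(input, string.Empty);
        } while (input != previous);
        return input;
    }
}
```

For `"Nested ((parentheses) should) also"`, the first pass removes `(parentheses)` and the second removes `( should)`. The number of passes grows with the nesting depth, so this is simple rather than fast.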

flup

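How about this: Regex.Replace seems to do the trick for simple, non-nested pairs that sit on a single line.

string Remove(string s, char begin, char end)
{
    // Escape the delimiters so that characters such as '(' are matched literally.
    // The lazy .*? stops at the first closing delimiter, so nested pairs are not handled,
    // and '.' does not match newlines unless RegexOptions.Singleline is set.
    Regex regex = new Regex(Regex.Escape(begin.ToString()) + ".*?" + Regex.Escape(end.ToString()));
    return regex.Replace(s, string.Empty);
}


string s = "Hello (my name) is (brian)";
s = Remove(s, '(', ')');

Output would be (the spaces around the removed text remain):

"Hello  is "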
Botonomous

Normally, this is not possible, because standard regular expressions cannot keep track of nesting depth. However, Microsoft does have some extensions to standard regular expressions. You may be able to achieve this with Grouping Constructs, although it may be faster to code this as an algorithm than to read and understand Microsoft's explanation of their extension.
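For comparison, one possible hand-written algorithm is a single pass with a depth counter, sketched below. This is an illustration under stated assumptions, not the code anyone in this thread actually wrote: the names `ParenScanner` and `Strip` are hypothetical, a stray `)` outside any group is kept, and an unclosed `(` causes everything after it to be dropped.

```csharp
using System;
using System.Text;

public static class ParenScanner
{
    // Hypothetical helper: copies only the characters found at nesting depth zero.
    public static string Strip(string input)
    {
        var output = new StringBuilder(input.Length);
        int depth = 0;
        foreach (char c in input)
        {
            if (c == '(')
            {
                depth++;                 // entering a (possibly nested) group
            }
            else if (c == ')' && depth > 0)
            {
                depth--;                 // leaving a group
            }
            else if (depth == 0)
            {
                output.Append(c);        // outside all groups: keep the character
            }
        }
        return output.ToString();
    }
}
```

Unlike the regex approaches, this treats an unclosed `(` as open until the end of the string, so its results can differ from theirs on unbalanced input.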

Alexandre Rafalovitch
  • I actually ended up solving this problem earlier today by just coding an algorithm to do the work. However, it left me very curious as to whether or not it could be done with Regex – Matt Brandon Jan 18 '13 at 21:37