
I'm new to regex. I want to validate input against this format:

  • Any character is allowed, except '{' and '}'.
  • A '{' char must be followed by one of a set of specific strings.

After these strings, any character may follow.

  • Each '{' must have a closing '}'.
  • Nesting of '{'s is allowed.

Example:

abc{FILE:any text} def {FILE:mno{ENV:xyz}}

FILE: and ENV: are examples of the specific strings required after a '{' char. I wrote this regex:

^
(
  [^\{\}]+
  |
  (?<Depth>\{)(FILE:|ENV:)
  |
  (<-Depth>\})
)*
(?(Depth)(?!))
$

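but it doesn't match my desired format. What am I missing?
Thanks a lot.
EDIT: Links that do the same, successfully I hope :-) [MSDN](http://msdn.microsoft.com/en-us/library/bs2twtah.aspx#balancing_group_definition), [Other site](http://blog.stevenlevithan.com/archives/balancing-groups)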

RoadBump
  • Regex should not be used to match recursive elements. You must complement a single-depth regular expression with some recursive .NET code instead. – qJake Jun 13 '12 at 18:46
  • @SpikeX is right, anything with nested code is a regular grammar, which is one step above regular expressions in complexity, meaning regex would be unable to parse it satisfactorily. Also, check out the top answer in this famous post http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags – Hans Z Jun 13 '12 at 18:48
  • @Hans A mod removed my comment last time I linked to that post.... he must enjoy writing recursive regex in his spare time... :) – qJake Jun 13 '12 at 18:50
  • *"a regular grammar, which is one step above regular expressions in complexity"* somehow that doesn't sound right. ;-) – Qtax Jun 13 '12 at 18:50
  • @Qtax No, it is... although it's probably not the best way to word it. See this: http://en.wikipedia.org/wiki/Chomsky_hierarchy – qJake Jun 13 '12 at 18:51
  • "Regular expressions describe regular languages in formal language theory. They have the same expressive power as regular grammars."[(1)](http://en.wikipedia.org/wiki/Regular_expression#Formal_language_theory) But most know that .NET regex are a lot more powerful than formal theory regular expressions. – Qtax Jun 13 '12 at 18:52
  • This is true, but that doesn't mean a *regular* expression should be used to parse an *irregular* syntax/grammar, even if Microsoft decided to beef up their regular expressions a bit. ;) – qJake Jun 13 '12 at 18:55
  • @SpikeX You're right, but I saw examples that do exactly this. This is not recursion, this is a play with balancing groups. (MS... giving you 10 complex ways but not the simple one...) Links: [MSDN](http://msdn.microsoft.com/en-us/library/bs2twtah.aspx#balancing_group_definition) [Other site](http://blog.stevenlevithan.com/archives/balancing-groups) – RoadBump Jun 13 '12 at 18:55
  • Let me put it this way. You're going to have a hell of a time parsing something like: `{FILE:a{ENV:x}bc{FILE:def{ENV:bcd{FILE:ab}123}5}}` (which, based on your original post, is valid) if you don't write a recursive .NET method to assist your regular expression. – qJake Jun 13 '12 at 19:02
  • @SpikeX: Qtax is right, you mean a "context-free grammar", not a regular grammar. Regular grammars (left-linear or right-linear) can be used anywhere a regular expression can be used, *and vice versa*. – Cameron Jun 13 '12 at 19:02
  • @SpikeX Right, I'm going to write a recursive method to parse it; all I want is to avoid having to validate it during parsing. Anyway, thank you all. Great site and great people! – RoadBump Jun 13 '12 at 19:24

1 Answer


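You forgot the question mark in the balancing group: (<-Depth>\}) should be (?<-Depth>\}). Without the question mark it is an ordinary capturing group that tries to match the literal text "<-Depth>}", so nothing is ever popped from the Depth stack.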

string pattern = @"(?x)
^
(?:
    [^{}]+
    |
    (?<Depth>{) (?:FILE|ENV):
    |
    (?<-Depth>})
)*
(?(Depth)(?!))
$
";

Should match strings like a {FILE: {ENV: foo } bar } baz
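
To try this out, here is a minimal sketch that wraps the pattern above in a small helper and runs it against a few inputs taken from the question and comments. The class name BraceFormat and the method IsValid are hypothetical names chosen for illustration; only Regex and its IsMatch method come from .NET.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; the names are not part of any library.
public static class BraceFormat
{
    // Same pattern as above. (?x) makes the engine ignore pattern whitespace
    // and treat everything after # on a line as a comment.
    private static readonly Regex Pattern = new Regex(@"(?x)
        ^
        (?:
            [^{}]+                      # plain text without braces
            |
            (?<Depth>{) (?:FILE|ENV):   # opening brace plus required prefix: push
            |
            (?<-Depth>})                # closing brace: pop, fails on an empty stack
        )*
        (?(Depth)(?!))                  # fail if an opening brace is still unclosed
        $
    ");

    public static bool IsValid(string input)
    {
        return Pattern.IsMatch(input);
    }

    public static void Main()
    {
        string[] samples =
        {
            "abc{FILE:any text} def {FILE:mno{ENV:xyz}}",          // example from the question
            "a {FILE: {ENV: foo } bar } baz",                      // example from this answer
            "{FILE:a{ENV:x}bc{FILE:def{ENV:bcd{FILE:ab}123}5}}",   // deeply nested, from the comments
            "{FILE:unclosed",                                      // missing closing brace
            "{OTHER:x}",                                           // prefix is not FILE: or ENV:
            "x}"                                                   // closing brace with nothing open
        };

        foreach (string s in samples)
        {
            Console.WriteLine("{0} -> {1}", s, IsValid(s));
        }
    }
}
```

The first three samples should be accepted and the last three rejected. Note that this only validates the format; it does not extract the nested parts, so a separate parsing step is still needed for that.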

Qtax