
I had a regex to find a single if-then-else condition:

string pattern2 = @"if( *.*? *)then( *.*? *)(?:else( *.*? *))?endif"; 

Now I need to extend this to handle nested ("looped") if conditions, but the regex does not extract the then and else parts correctly.

Example nested if condition:

if (2 > 1) then ( if(3>2) then ( if(4>3) then 4 else 3 endif ) else 2 endif) else 1 endif

Expected Result with Regex:

condition = (2>1)
then part = ( if(3>2) then ( if(4>3) then 4 else 3 endif ) else 2 endif)
else part = 1

I can then check whether the then and else parts hold plain values or another condition, and apply the same regex to each inner condition until everything is resolved.

The current regex returns result like:

condition = (2 > 1)
then part = ( if( 3>2) then ( if(4>3) then 3
else part = 3

That is, it takes the value after the first "else" it finds, when it should extract from the last else.

Can someone help me with this?

Triad sou.
Uma Ilango
  • In general regular expressions can *not* deal with arbitrary nested expressions. – sverre Sep 29 '11 at 11:49
  • Please provide information about the language you are using this regex in. – Erik Sep 29 '11 at 12:06
  • You can't correctly parse a non regular language with a regular expression. http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 – Jonas Elfström Sep 29 '11 at 12:46
  • Maybe I'm confusing something with another, but what regular expression syntax/library is this? – user Sep 29 '11 at 12:48
  • @Jonas: .NET "regular" expressions are (like most other current regex flavors) far from only being able to match regular languages. They support recursion and thus can match arbitrarily nested constructs. However, performance of recursive regexes can be very poor as compared to a parser (and it's much harder to write a regex that will work correctly). – Tim Pietzcker Sep 29 '11 at 12:54

2 Answers


You can adapt the solution from the answer to "Can regular expressions be used to match nested patterns?" ( http://retkomma.wordpress.com/2007/10/30/nested-regular-expressions-explained/ ).

That solution shows how to match the content between HTML tags, even when it contains nested tags. Applying the same idea to pairs of parentheses should solve your problem.

EDIT:

using System;
using System.Text.RegularExpressions;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            String matchParenthesis = @" 
                (?# line 01) \((

                (?# line 02) (?> 

                (?# line 03) \( (?<DEPTH>) 

                (?# line 04) | 

                (?# line 05) \) (?<-DEPTH>) 

                (?# line 06) | 

                (?# line 07) .? 

                (?# line 08) )* 

                (?# line 09) (?(DEPTH)(?!)) 

                (?# line 10) )\) 

                ";

            // Earlier test input, with the nested if inside the then part:
            //string source = "if (2 > 1) then ( if(3>2) then ( if(4>3) then 4 else 3 endif ) else 2 endif) else 1 endif"; 
            // Current test input, with the nested if inside the else part:
            string source = "if (2 > 1) then 2 else ( if(3>2) then ( if(4>3) then 4 else 3 endif ) else 2 endif) endif";
            // Each part is either text with no opening parenthesis or one balanced parenthesized group.
            string pattern = @"if\s*(?<condition>(?:[^(]*|" + matchParenthesis + @"))\s*";
            pattern += @"then\s*(?<then_part>(?:[^(]*|" + matchParenthesis + @"))\s*";
            pattern += @"else\s*(?<else_part>(?:[^(]*|" + matchParenthesis + @"))\s*endif";


            Match match = Regex.Match(source, pattern, 
                    RegexOptions.IgnorePatternWhitespace | RegexOptions.IgnoreCase); 

            Console.WriteLine(match.Success.ToString()); 
            Console.WriteLine("source: " + source ); 
            Console.WriteLine("condition = " + match.Groups["condition"]); 
            Console.WriteLine("then part = " + match.Groups["then_part"]); 
            Console.WriteLine("else part = " + match.Groups["else_part"]); 
        }
    }
}
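For comparison, here is a minimal regex-light sketch of the same split that counts if/endif nesting depth instead of balancing parentheses, so it does not care whether the parts are parenthesized. The names IfSplitter and Split are hypothetical and not part of the code above, and the sketch assumes the keywords never appear inside identifiers or string literals.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the answer's code.
static class IfSplitter
{
    // Splits the outermost "if ... then ... [else ...] endif" into its parts.
    // Else is null when the outermost if has no else branch.
    public static (string Condition, string Then, string Else) Split(string source)
    {
        int depth = 0;
        int condStart = -1, condEnd = -1, thenStart = -1, thenEnd = -1, elseStart = -1;

        // Visit the keywords in order of appearance; \b keeps "if" from matching inside "endif".
        foreach (Match m in Regex.Matches(source, @"\b(if|then|else|endif)\b", RegexOptions.IgnoreCase))
        {
            switch (m.Value.ToLowerInvariant())
            {
                case "if":
                    depth++;
                    if (depth == 1) condStart = m.Index + m.Length;
                    break;
                case "then":
                    // Only a "then" at depth 1 belongs to the outermost if.
                    if (depth == 1) { condEnd = m.Index; thenStart = m.Index + m.Length; }
                    break;
                case "else":
                    if (depth == 1) { thenEnd = m.Index; elseStart = m.Index + m.Length; }
                    break;
                case "endif":
                    depth--;
                    if (depth < 0) throw new FormatException("endif without a matching if");
                    if (depth == 0)
                    {
                        if (thenStart < 0) throw new FormatException("if without then");
                        string condition = source.Substring(condStart, condEnd - condStart).Trim();
                        if (elseStart < 0)
                            return (condition, source.Substring(thenStart, m.Index - thenStart).Trim(), null);
                        return (condition,
                                source.Substring(thenStart, thenEnd - thenStart).Trim(),
                                source.Substring(elseStart, m.Index - elseStart).Trim());
                    }
                    break;
            }
        }
        throw new FormatException("if without a matching endif");
    }
}
```

Calling IfSplitter.Split on the question's example yields the condition "(2 > 1)", the then part "( if(3>2) then ( if(4>3) then 4 else 3 endif ) else 2 endif)" and the else part "1". Any part that still contains an if can be passed to Split again.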
mMontu
  • Thanks. It works. But when I add a condition within else, it does not work. For example, "if (2 > 1) then 2 else ( if(3>2) then ( if(4>3) then 4 else 3 endif ) else 2 endif) endif" Can you please help? – Uma Ilango Sep 29 '11 at 19:51
  • It wasn't clear from your question that the parenthesis pair is optional. I've updated the code assuming that parenthesis pair is optional on all parts. – mMontu Sep 29 '11 at 20:56

If you replace endif with end, you get

if (2 > 1) then ( if(3>2) then ( if(4>3) then 4 else 3 end) else 2 end) else 1 end

which is a perfectly valid Ruby expression. Download IronRuby and add references to IronRuby, IronRuby.Libraries, and Microsoft.Scripting to your project; you will find them in C:\Program Files\IronRuby 1.0v4\bin. Then add

using Microsoft.Scripting;
using Microsoft.Scripting.Hosting;
using IronRuby;

and in your code

var engine = Ruby.CreateEngine();
int result = engine.Execute("if (2 > 1) then ( if(3>2) then ( if(4>3) then 4 else 3 end ) else 2 end) else 1 end");
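The endif-to-end replacement described above can be done mechanically before the string is handed to the engine. The following is a small sketch; EndifRewriter and ToRuby are hypothetical names, and it assumes the word endif never occurs inside a string literal in the expression.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of IronRuby.
static class EndifRewriter
{
    // Replaces every standalone "endif" keyword with Ruby's "end".
    // The \b anchors keep identifiers such as "endifValue" untouched.
    public static string ToRuby(string source)
    {
        return Regex.Replace(source, @"\bendif\b", "end", RegexOptions.IgnoreCase);
    }
}
```

For example, EndifRewriter.ToRuby("if (2 > 1) then 2 else 1 endif") returns "if (2 > 1) then 2 else 1 end", which can then be passed to engine.Execute.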
Jonas Elfström