1

I have an arithmetic expression

string exp = "((2+3.1)/2)*4.456";

I want to validate by using regular expression. The expression can only have integers, floating point numbers, operands and parenthesis.

How can i generate regular expression to validate please help or suggest any other way to validate that string.

Masood
  • 15
  • 1
  • 6
  • 1
    It'd be too complex. You'll have to take care of situations like `(*)` for example. – Andrew Logvinov Feb 25 '12 at 10:45
  • I would advise you not to go there. It's like trying to parse XML with R.E. - it's doesn't work. – AVIDeveloper Feb 25 '12 at 10:47
  • 1
    Impossible, R.E. cannot do that – Daniel Feb 25 '12 at 10:47
  • any other way to validate that expression?? – Masood Feb 25 '12 at 10:50
  • 1
    You need some sort of arithmetic operations parser. – Maciej Feb 25 '12 at 10:56
  • @Dani, it's not impossible at all, nor very complicated, but it's not what OP really wants I would think. – Qtax Feb 25 '12 at 10:59
  • You'll need to create an expression evaluation tree. See [this article](http://www.arstdesign.com/articles/expression_evaluation.html) as a starting point. – AVIDeveloper Feb 25 '12 at 10:59
  • @Qtax: R.E. cannot parse a recursive expression tree – Daniel Feb 25 '12 at 11:00
  • @Dani, as I said, it can do that just fine. This is not formal language theory regex we are talking about here. – Qtax Feb 25 '12 at 11:05
  • If you need to evaluate mathematical expression you could use [this](http://ncalc.codeplex.com/) – digEmAll Feb 25 '12 at 11:11
  • @Dani, I'm not sure about C# but with PCRE and friends it's as easy as `(?R)` and the like. Don't be misslead by the name, "regex", see http://stackoverflow.com/questions/7434272/match-an-bn-cn-e-g-aaabbbccc-using-regular-expressions-pcre and http://stackoverflow.com/questions/4231382/regular-expression-pattern-not-matching-anywhere-in-string/4234491#4234491 etc. – Qtax Feb 25 '12 at 11:12
  • @Dani, I was curious about C# solution, so I posted an answer. – Qtax Feb 25 '12 at 15:00

2 Answers2

7

Using Perl/PCRE we could verify such simple arithmetic expressions with help of a pattern structured like:

expr = pnum ( op pnum )*
pnum = num | \( expr \)

Where num and op defined as required. For example:

num = -?+\d++(?:\.\d++)?+
op = [-+*/]

Which would give us the following working expression:

(?x)^ (?&expr) $

(?(DEFINE)
    (?<expr> (?&pnum) (?: (?&op) (?&pnum) )*+   )
    (?<pnum> (?> (?&num) | \( (?&expr) \) )   )
    (?<num>  -?+\d++(?:\.\d++)?+   )
    (?<op>   [-+*/]   )
)

But such expressions could not be used with .NET regex as it does not support (recursive) suppatern calls (?&name). Instead .NET regex lib offers us its special feature: balancing groups.

With balancing groups we could rewrite the required recursive call used in pnum, and use a structure like this instead:

expr = pnum ( op pnum )* (?(p)(?!))
pnum = (?> (?<p> \( )* num (?<-p> \) )* )

What we've done here is to allow any number of optional opening and closing paranthesis before and after every number, counting the total number of open parentheses (?<p> \( ), subtracting closing parentheses from that number (?<-p> \) ) and at the end of the expression make sure that the number of open parentheses is 0 (?(p)(?!)).

(I believe this is equivalent to the original structure, altho I haven't made any formal proof.)

Resulting in the following .NET pattern:

(?x)
^
(?> (?<p> \( )* (?>-?\d+(?:\.\d+)?) (?<-p> \) )* )
(?>(?:
    [-+*/]
    (?> (?<p> \( )* (?>-?\d+(?:\.\d+)?) (?<-p> \) )* )
)*)
(?(p)(?!))
$

C# Example:

using System;
using System.Text.RegularExpressions;

namespace RegexTest
{
    class Program
    {
        static void Main(string[] args)
        {
            var expressions = new string[] {
                "((2+3.1)/2)*4.456",
                "1",
                "(2)",
                "2+2",
                "(1+(2+3))",
                "-2*(2+-2)",
                "1+(3/(2+7-(4+3)))",
                "1-",
                "2+2)",
                "(2+2",
                "(1+(2+3)",
            };

            var regex = new Regex(@"(?x)
                ^
                (?> (?<p> \( )* (?>-?\d+(?:\.\d+)?) (?<-p> \) )* )
                (?>(?:
                    [-+*/]
                    (?> (?<p> \( )* (?>-?\d+(?:\.\d+)?) (?<-p> \) )* )
                )*)
                (?(p)(?!))
                $
            ");

            foreach (var expr in expressions)
            {
                Console.WriteLine("Expression: " + expr);
                Console.WriteLine("    Result: " + (regex.IsMatch(expr) ? "Matched" : "Failed"));
            }
        }
    }
}

Output:

Expression: ((2+3.1)/2)*4.456
    Result: Matched
Expression: 1
    Result: Matched
Expression: (2)
    Result: Matched
Expression: 2+2
    Result: Matched
Expression: (1+(2+3))
    Result: Matched
Expression: -2*(2+-2)
    Result: Matched
Expression: 1+(3/(2+7-(4+3)))
    Result: Matched
Expression: 1-
    Result: Failed
Expression: 2+2)
    Result: Failed
Expression: (2+2
    Result: Failed
Expression: (1+(2+3)
    Result: Failed
Qtax
  • 33,241
  • 9
  • 83
  • 121
1

You could write a simple lexer in F# using fslex/fsyacc. Here is an example which is very close to your requirement: http://blogs.msdn.com/b/chrsmith/archive/2008/01/18/fslex-sample.aspx

user1227804
  • 390
  • 1
  • 5