As several comments say, what you ask for is impossible with just a regex match. In fact, matching balanced parentheses is one of the classic "problems that cannot be solved by a simple regular expression". As long as your mathematical expressions can contain arbitrarily nested parentheses, you can't validate them with a single regex.
However, it is possible to validate a smaller language, and we can then build that up into a validation routine for your language with a little bit of coding. The smaller language is just like your language but with one change: no parentheses allowed. Then, valid expressions in the language look like this:
INTEGER OP INTEGER OP INTEGER OP .... OP INTEGER
Another way to say that is "an INTEGER followed by zero or more OP INTEGER sequences". This can be translated into a regex, as:
Pattern simpleLang = Pattern.compile("-?\\d+([-+*%/]-?\\d+)*");
So -?\d+ means INTEGER, and [-+*%/] means OP. Okay, now how do we use this? Well, first off let's modify it to add arbitrary spaces in there between integers, and make the pattern static, because we're going to wrap this validation logic up in a class:
static Pattern simpleLang = Pattern.compile("\\s*-?\\d+(\\s*[-+*%/]\\s*-?\\d+)*\\s*");
(Though note that we don't allow a space between a negative sign and the number that follows it, so 3 - - 4 isn't allowed, even though 3 - -4 is allowed.)
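To get a feel for what the spaced pattern accepts and rejects, here is a small sketch. The class name SimpleLangDemo and the helper isSimple are made up for this illustration; only the pattern itself comes from above.

```java
import java.util.regex.Pattern;

// Hypothetical demo class: the name and the isSimple helper are illustration only.
class SimpleLangDemo {
    static Pattern simpleLang = Pattern.compile("\\s*-?\\d+(\\s*[-+*%/]\\s*-?\\d+)*\\s*");

    // True when the whole string is INTEGER (OP INTEGER)*, with optional spaces.
    static boolean isSimple(String s) {
        return simpleLang.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isSimple("11 - 2"));          // true
        System.out.println(isSimple("7 * 15 % 1 / 4"));  // true
        System.out.println(isSimple("3 - -4"));          // true: "-4" is a single INTEGER
        System.out.println(isSimple("3 - - 4"));         // false: space after the negative sign
        System.out.println(isSimple("3 +"));             // false: trailing operator
        System.out.println(isSimple("(1 + 2)"));         // false: no parens in the simple language
    }
}
```

Note that matches() requires the whole string to fit the pattern, which is why a trailing operator or a stray paren is enough to reject the input.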
Now, to validate the full language, what we need to do is repeatedly find a chunk that's at the innermost parenthesized level (so, a chunk containing no parens itself but surrounded by an open-close paren pair) and validate that the stuff inside the parens matches the simple language. If it does, we replace that chunk (including the surrounding parens) with some integer, surrounded by spaces so that it's considered separate from the surrounding stuff. So the logic is something like this:
expr coming in is 11 - (7 * 15 % (11 - 2) / 4)
- Innermost parenthesized chunk is 11 - 2
- Does 11 - 2 match the simple language? Yes!
- Replace (11 - 2) with some integer. For example, with 1.
expr is now 11 - (7 * 15 % 1 / 4)
- Innermost parenthesized chunk is 7 * 15 % 1 / 4
- Does 7 * 15 % 1 / 4 match the simple language? Yes!
- Replace (7 * 15 % 1 / 4) with some integer. For example, with 1.
expr is now 11 - 1
- No more parens, so ask: does expr match the simple language? Yes!
In code this works out to:
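import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The class name is arbitrary; any class can hold these members.
public class ExprValidator {
    // INTEGER (OP INTEGER)*, with optional whitespace around each token
    static Pattern simpleLang = Pattern.compile("\\s*-?\\d+(\\s*[-+*%/]\\s*-?\\d+)*\\s*");
    // An innermost parenthesized chunk: parens around text that contains no parens
    static Pattern innerParen = Pattern.compile("[(]([^()]*)[)]");

    public static boolean validateExpr(String expr) {
        while (expr.contains(")") || expr.contains("(")) {
            Matcher m = innerParen.matcher(expr);
            if (m.find()) {
                if (!simpleLang.matcher(m.group(1)).matches()) {
                    return false;
                }
                // Replace the validated chunk, parens included, with a placeholder integer
                expr = expr.substring(0, m.start()) + " 1 " + expr.substring(m.end());
            } else {
                // we have parens but not an innermost paren-free region
                // This implies mismatched parens
                return false;
            }
        }
        return simpleLang.matcher(expr).matches();
    }
}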
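If you want to watch the reduction happen, here is a sketch that runs the same loop but records every intermediate string. The class TraceDemo and the method trace are hypothetical names for this illustration, and the "VALID"/"INVALID" markers are just a convention chosen here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical tracing variant of the validation loop; names are illustration only.
class TraceDemo {
    static Pattern simpleLang = Pattern.compile("\\s*-?\\d+(\\s*[-+*%/]\\s*-?\\d+)*\\s*");
    static Pattern innerParen = Pattern.compile("[(]([^()]*)[)]");

    // Returns each intermediate form of expr; the last entry is "VALID" or "INVALID".
    static List<String> trace(String expr) {
        List<String> steps = new ArrayList<>();
        steps.add(expr);
        while (expr.contains(")") || expr.contains("(")) {
            Matcher m = innerParen.matcher(expr);
            // No innermost chunk (mismatched parens) or a chunk outside the simple language
            if (!m.find() || !simpleLang.matcher(m.group(1)).matches()) {
                steps.add("INVALID");
                return steps;
            }
            expr = expr.substring(0, m.start()) + " 1 " + expr.substring(m.end());
            steps.add(expr);
        }
        steps.add(simpleLang.matcher(expr).matches() ? "VALID" : "INVALID");
        return steps;
    }

    public static void main(String[] args) {
        // Brackets make the padding spaces around each placeholder visible.
        for (String step : trace("11 - (7 * 15 % (11 - 2) / 4)")) {
            System.out.println("[" + step + "]");
        }
        System.out.println(trace("(1 + 2"));  // mismatched parens end in INVALID
    }
}
```

The intermediate strings contain doubled spaces where a placeholder was inserted next to an existing space. That is harmless, because the simple language allows any amount of whitespace around operators.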
Note that there is one expression you called "valid" that this will not call valid: namely, the expression 13(12)+11-(7*15%(11-2)/4). This will be considered invalid because there is no operator between 13 and 12. If you wish to allow that sort of implicit multiplication, the easiest way to do it is to add ' ' (the space character) as an allowed operator in the simple language. This works because each parenthesized chunk is replaced by " 1 ", so 13(12) turns into 13 1 with a space in between. So change simpleLang to:
static Pattern simpleLang = Pattern.compile("\\s*-?\\d+(\\s*[-+ *%/]\\s*-?\\d+)*\\s*");
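Here is a sketch comparing the two versions of the pattern on that expression. The class ImplicitMultDemo, the field names strict and withSpaceOp, and the validate helper (the same loop, with the simple-language pattern passed in as a parameter) are all made up for this comparison.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical comparison harness; names are illustration only.
class ImplicitMultDemo {
    static Pattern strict = Pattern.compile("\\s*-?\\d+(\\s*[-+*%/]\\s*-?\\d+)*\\s*");
    static Pattern withSpaceOp = Pattern.compile("\\s*-?\\d+(\\s*[-+ *%/]\\s*-?\\d+)*\\s*");
    static Pattern innerParen = Pattern.compile("[(]([^()]*)[)]");

    // Same reduction loop as validateExpr, with the simple-language pattern as a parameter.
    static boolean validate(Pattern simpleLang, String expr) {
        while (expr.contains(")") || expr.contains("(")) {
            Matcher m = innerParen.matcher(expr);
            if (!m.find() || !simpleLang.matcher(m.group(1)).matches()) {
                return false;
            }
            expr = expr.substring(0, m.start()) + " 1 " + expr.substring(m.end());
        }
        return simpleLang.matcher(expr).matches();
    }

    public static void main(String[] args) {
        String e = "13(12)+11-(7*15%(11-2)/4)";
        System.out.println(validate(strict, e));          // false: no operator between 13 and (12)
        System.out.println(validate(withSpaceOp, e));     // true: the space acts as the operator
        // Side effect of treating a space as an operator:
        System.out.println(validate(strict, "3 4"));      // false
        System.out.println(validate(withSpaceOp, "3 4")); // true
    }
}
```

One thing to be aware of: once the space counts as an operator, two integers separated only by whitespace (such as 3 4) are accepted too, even with no parentheses involved. Whether that is acceptable depends on how strict you want the validation to be.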