10

I am going to write an expression evaluator which only does addition and subtraction. I have a simple algorithm to do that; but, I have some implementation problems.

I considered an expression as (it is a String)

"(" <expression1> <operator> <expression2> ")"

Here is my algorithm

String evaluate( String expression )

   if expression is digit
      return expression

   else if expression is "(" <expression1> <operator> <expression2> ")"
      cut the brackets out of it
      expression1 = evaluate( <expression1> )
      operator = <operator>
      expression2 = evaluate( <expression2> )

   if operator is +
      expression1 + expression2

   else if operator is -
      expression1 - expression2 

My problem is parsing <expression1>, <operator> and <expression2> from the expression. How can I do that?

Note: I'm not asking for a code. All I need is an idea to do that.

Thank you,

-Ali

629
  • 308
  • 1
  • 6
  • 15
  • If you are interested in a working example of a small Java math evaluator written in precisely this way, I have one on my website: http://www.softwaremonkey.org/Code/MathEval – Lawrence Dol Nov 02 '10 at 00:16

6 Answers6

7

My problem is parsing <expression1>, <operator> and <expression2> from the expression

Don't do that, then :) When you see an opening bracket, do your recursive call to expression. At the end of the expresssion, either you find another operator (and so you're not at the end of the expression after all), or a right-bracket, in which case you return from the evaluate.

The Archetypal Paul
  • 41,321
  • 20
  • 104
  • 134
  • Well, to do a recursive call to parse expression1, he would basically need to count parenthesis in order to tell where expression1 ended, but otherwise I like your answer. – aioobe Nov 01 '10 at 21:26
  • 1
    Not really. The recursion does the counting for you. If you run across a ) at a point where an expression could end, that's the end of this recursive call. This is how recursive-descent parsers work... – The Archetypal Paul Nov 01 '10 at 21:28
  • 1
    Ah, so you recurse on the tail of the string, even if it's ill-balanced? – aioobe Nov 01 '10 at 21:36
  • 1
    I think so, if I understand you correctly. If you see a (, you recurse. Either you hit the end of the input (in which case, error) or you see a balancing ) and return from this recursion. If you see a ) after returning to the top level, that's an error too. This is what (recursive descent) parser generators will produce, but it's educational to implement one yourself. That's why they're called recursive descent, in fact! – The Archetypal Paul Nov 01 '10 at 21:45
  • You don't have to do any of that. Your term() and factor() and prime() methods should just return if the next token isn't something they can handle. So when expression() returns into the code that called it because of the '(', the next token should be ')'. If it isn't, it's missing. – user207421 Nov 02 '10 at 06:33
3

Either you use a parser generator such as JavaCUP or ANTLR. Write up a BNF of your expression and generate a parser. Here is a sample grammar that would get you started:

Expression ::= Digit
            |  LeftBracket Expression Plus Expression RightBracket
            |  LeftBracket Expression Minus Expression RightBracket
            |  LeftBracket Expression RightBracket

A "hacky" way of doing it yourself would be to look for the first ) backtrack to the closest ( look at the parenthesis free expression in between, simply split on the operator symbols and evaluate.

aioobe
  • 413,195
  • 112
  • 811
  • 826
3

Use a StringTokenizer to split your input string into parenthesis, operators and numbers, then iterate over your tokens, making a recursive call for every open-parens, and exiting your method for every close parenthesis.

I know you didn't ask for code, but this works for valid input:

public static int eval(String expr) {
    StringTokenizer st = new StringTokenizer(expr, "()+- ", true);
    return eval(st);
}

private static int eval(StringTokenizer st) {
    int result = 0;
    String tok;
    boolean addition = true;
    while ((tok = getNextToken(st)) != null) {
        if (")".equals(tok))
            return result;
        else if ("(".equals(tok))
            result = eval(st);
        else if ("+".equals(tok))
            addition = true;
        else if ("-".equals(tok))
            addition = false;
        else if (addition)
            result += Integer.parseInt(tok);
        else
            result -= Integer.parseInt(tok);
    }
    return result;
}

private static String getNextToken(StringTokenizer st) {
    while (st.hasMoreTokens()) {
        String tok = st.nextToken().trim();
        if (tok.length() > 0)
            return tok;
    }
    return null;
}

It would need better handling of invalid input, but you get the idea...

Luke Hutteman
  • 1,921
  • 13
  • 11
  • I didn't understand why did you use getNextToken() instead of using nextToken() ? – 629 Nov 01 '10 at 22:30
  • It doesn't handle parentheses or operator precedence correctly, and it never will until you introduce recursion into it, or an operand stack. – user207421 Nov 02 '10 at 06:44
  • Parenthesis are handled correctly, and since addition and subtraction (the only 2 operations needed) have the same precedence, there's no need to add any additional logic for that. If you were to want multiplication and division, then yes, you'd need an operand stack. – Luke Hutteman Nov 02 '10 at 13:05
  • @ECP: Mea Culpa - I see you're right about the mishandling of parenthesis; my unnecessary recursive call for simple addition or subtraction was messings things up there... that's what I get for trying to hack some code together in 5 minutes :p I fixed the code to remove this unnecessary recursion. – Luke Hutteman Nov 02 '10 at 19:10
  • @alicozgo getNextToken() is used to skip whitespace, though in retrospect that could've just been ignored by eval() itself as well. And you're right, this is essentially the same solution Paul suggested. – Luke Hutteman Nov 02 '10 at 19:14
3

I would recommend changing the infix input into postfix and then evaluating it, rather than reducing the expression infix-wise. There are already well defined algorithms for this and it doesn't come with the inherent multiple-nested-parentheses parsing problems.

Take a look at the Shunting Yard Algorithm to convert to postfix/RPN then evaluate it using a stack using Postfix Operations. This is fast (O(n)) and reliable.

HTH

Callum Rogers
  • 15,630
  • 17
  • 67
  • 90
1

I would suggest taking an approach that more closely resembles the one described in this old but (in my opinion) relevant series of articles on compiler design. I found that the approach of using small functions/methods that parse parts of the expression to be highly effective.

This approach allows you to decompose your parsing method into many sub-methods whose names and order of execution closely follows the EBNF you might use to describe the expressions to be parsed.

Martin Törnwall
  • 9,299
  • 2
  • 28
  • 35
-2

Perhaps create regular expressions for expression and operator and then use matching to identify and break out your content.

jarmod
  • 71,565
  • 16
  • 115
  • 122