0

This is my situation: the input is a string that contains a normal mathematical operation like 5+3*4. Functions are also possible, i.e. min(5,A*2). This string is already tokenized, and now I want to parse it using stacks (so no AST). I first used the Shunting Yard Algorithm, but here my main problem arise:

Suppose you have this (tokenized) string: min(1,2,3,+) which is obviously invalid syntax. However, SYA turns this into the output stack 1 2 3 + min(, and hopefully you see the problem coming. When parsing from left to right, it sees the + first, calculating 2+3=5, and then calculating min(1,5), which results in 1. Thus, my algorithm says this expression is completely fine, while it should throw a syntax error (or something similar).

What is the best way to prevent things like this? Add a special delimiter (such as the comma), use a different algorithm, or what?

orangeman
  • 59
  • 6
  • There's pseudocode for a Shunting-Yard parser which understands function calls in [this answer](https://stackoverflow.com/a/16392115/1566221). Note that when the pseudocode tells you to "mark `(` as a postfix operator", that means you need to treat it as a function call. – rici Jan 27 '19 at 02:47

1 Answers1

1

In order to prevent this issue, you might have to keep track of the stack depth. The way I would do this (and I'm not sure it is the "best" way) is with another stack.

The new stack follows these rules:

  • When an open parentheses, (, or function is parsed, push a 0.
    • Do this in case of nested functions
  • When a closing parentheses, ), is parsed, pop the last item off and add it to the new last value on the stack.
    • The number that just got popped off is how many values were returned by the function. You probably want this to always be 1.
  • When a comma or similar delimiter is parsed, pop from the stack, add that number to the new last element, then push a 0.
    • Reset so that we can begin verifying the next argument of a function
    • The value that just got popped off is how many values were returned by the statement. You probably want this to always be 1.
  • When a number is pushed to the output, increment the top element of this stack.
    • This is how many values are available in the output. Numbers increase the number of values. Binary operators need to have at least 2.
  • When a binary operator is pushed to the output, decrement the top element
    • A binary operator takes 2 values and outputs 1, thus reducing the overall number of values left on the output by 1.
    • In general, an n-ary operator that takes n values and returns m values should add (m-n) to the top element.
    • If this value ever becomes negative, throw an error!

This will find that the last argument in your example, which just contains a +, will decrement the top of the stack to -1, automatically throwing an error.

But then you might notice that a final argument in your example of, say, 3+ would return a zero, which is not negative. In this case, you would throw an error in one of the steps where "you probably want this to always be 1."

Zeda
  • 382
  • 4
  • 13