Modify piece of code to allow floats and negative numbers as well as an arbitrary amount of spaces in between the characters in the inputted string

Question

The following piece of code gets an infix string and will convert it to postfix and output that new expression as a string. However it does not support negative numbers or floats. The following code only allows for single digit values:

Such as (0-9) nothing like 10 or 11. Otherwise it throws a "key error". Also if I add a negative sign it also throws a key error.

class Stack:
    def __init__(self):
        self.items = []

    def isEmpty(self):
        return self.items == []

    def push(self, item):
        self.items.append(item)

    def pop(self):
        return self.items.pop()

    def peek(self):
        return self.items[len(self.items)-1]

    def size(self):
        return len(self.items)

    def isNumber(self, txt):
        if not isinstance(txt,str) or len(txt.strip())==0:
            print("Argument error in isNumber")
            return False
        # YOUR CODE STARTS HERE
        try:
            float(txt)
            return True
        except ValueError:
            return False

#########################################################################################################

    def infixToPostfix(infixexpr):
        prec = {}
        prec["^"] = 4
        prec["*"] = 3
        prec["/"] = 3
        prec["+"] = 2
        prec["-"] = 2
        prec["("] = 1
        opStack = Stack()
        postfixList = []
        tokenList = infixexpr.split()

        for token in tokenList:
            if token in "0123456789":
                postfixList.append(token)
            elif token == '(':
                opStack.push(token)
            elif token == ')':
                topToken = opStack.pop()
                while topToken != '(':
                    postfixList.append(topToken)
                    topToken = opStack.pop()
            else:
                while (not opStack.isEmpty()) and \
                   (prec[opStack.peek()] >= prec[token]):
                      postfixList.append(opStack.pop())
                opStack.push(token)

        while not opStack.isEmpty():
            postfixList.append(opStack.pop())
        return " ".join(postfixList)

So here is my fix to allow floats as well:

I added this function:

def isNumber(x):
    try:
        float(x)
        return True
    except ValueError:
        return False

And changed this line: if token in "0123456789": into this: if Stack.isNumber(token):

And now the code allows floats.

So what is the other problem? Well the other problem is that my code assumes the input string will have exactly one space in between each of the characters, hence I string.split() to put all the characters in the list. Except the input string can have an arbitrary amount of spaces in between the characters, and if there are no spaces, my code will compare something like "((" to my list of characters and not find it and throw a Key error. So since I have to allow a negative numbers (noted by a minus sign). How can I modify my code so that it will no longer throw the keyerror and allow me to have negative numbers?

When I do this:

print(Stack.infixToPostfix("( ( 1 + 3 ) ) * 4 - ( 9.2 - 0 ) * ( 5 + 8 )"))

My code outputs this: 1 3 + 4 * 9.2 0 - 5 8 + * -

Which works perfectly, however if I remove one space:

"(( 1 + 3 ) ) * 4 - ( 9.2 - 0 ) * ( 5 + 8 )"

My code no longer works. Key error '((' I know why it throws this error (explanation above) but I'm not sure how to fix it.

So final question TL:DR

How to modify my infixtopostfix code to allow for an arbitrary amount of spaces between the characters and allow for negative numbers?

So your code will move the negative sign even though it's supposed to remain in front of the number after the conversion correct? (Assuming the one space in between otherwise it will throw that key error). This is difficult, I suggest rewriting your entire code. — , Mar 12 '20 at 14:31
It just needs to parse the text (tokenize). Naively spliting on spaces is a good starting point to implement the state engine, but it places way too many restrictions on the input data. That's good programming, though, worry about one task at a time. — Kenny Ostrom, Mar 12 '20 at 16:18
Variable and function names should follow the `lower_case_with_underscores` style. — AMC, Mar 12 '20 at 19:42

Alex Sveshnikov · Accepted Answer · 2020-03-13T06:11:53.157

First create a separate function which will produce a list of tokens from your string. A token is number (without a sign) or a single character:

def tokenize(s):
    s = re.sub(r"\s+", "", s)
    result = []
    while (s):
        if s[0] in "0123456789":
            number = s[0]
            s = s[1:]
            while (s and s[0] in "0123456789."):
                number += s[0]
                s = s[1:]
            result.append(number)
            if s:
                result.append(s[0])
                s = s[1:]
        else:
            result.append(s[0])
            s = s[1:]
    return result

Then you'll need to keep track of unary plus and minus operations. To do this we introduce a special 'neg' operation - when you process this operation in postfix notation you just negate the value at the top of the operand stack.

You expect unary plus and minus operations at the start of the string or right after the opening '('. After you process a number operand or a closing ')' you reset the unary flag to False, because unary plus or minus cannot appear at these positions. When the unary flag is true you must keep tracking of incoming '+' and '-', use boolean flag 'neg' for it. Change 'neg' state at every '-'. When you finally find an operand - check the state of 'neg' flag. If it's True, then you need to put our special 'neg' operation after the operand. Putting a 'neg' operation after the closing ')' is a bit tricky and requires use of opStack.

def infixToPostfix(infixexpr):
        prec = {}
        prec["^"] = 3
        prec["*"] = 3
        prec["/"] = 3
        prec["+"] = 2
        prec["-"] = 2
        prec["("] = 1
        prec["neg"] = 1
        opStack = Stack()
        postfixList = []
        tokenList = tokenize(infixexpr)
        print(tokenList)

        unary = True
        neg = False

        for token in tokenList:
            if unary and token in "+-":
                if token == '-':
                     neg = not neg
            elif isNumber(token):
                postfixList.append(token)
                if neg:
                    postfixList.append("neg")
                    neg = False
                unary = False
            elif token == '(':
                if neg:
                    opStack.push("neg")
                    neg = False
                opStack.push(token)
                unary = True
            elif token == ')':
                topToken = opStack.pop()
                unary = False
                while topToken != '(':
                    postfixList.append(topToken)
                    topToken = opStack.pop()
                if not opStack.isEmpty() and opStack.peek() == "neg":
                    postfixList.append(opStack.pop())
            else:
                while (not opStack.isEmpty()) and \
                   (prec[opStack.peek()] >= prec[token]):
                      postfixList.append(opStack.pop())
                opStack.push(token)

        while not opStack.isEmpty():
            postfixList.append(opStack.pop())
        return " ".join(postfixList)

Input:

"-(( 1 + 3 ) ) * 4 - ( -9.2 - 0 ) * ( 5 + 8 ) - 4 * (-2)"

Output:

1 3 + neg 4 * 9.2 neg 0 - 5 8 + * - 4 2 neg * -

UPDATE 2020-03-12

If you want to process negative numbers as a single negative operand and not like a positive operand followed by a 'neg' operation, then you just need a very small modification of the infixToPostfix method. You only need to modify the elif isNumber(token) branch. I'll put it here complete though:

def infixToPostfix(infixexpr):
        prec = {}
        prec["^"] = 3
        prec["*"] = 3
        prec["/"] = 3
        prec["+"] = 2
        prec["-"] = 2
        prec["("] = 1
        prec["neg"] = 1
        opStack = Stack()
        postfixList = []
        tokenList = tokenize(infixexpr)

        unary = True
        neg = False

        for token in tokenList:
            if unary and token in "+-":
                if token == '-':
                     neg = not neg
            elif isNumber(token):
                if neg:
                    postfixList.append("-" + token)
                else:
                    postfixList.append(token)
                neg = False
                unary = False
            elif token == '(':
                if neg:
                    opStack.push("neg")
                    neg = False
                opStack.push(token)
                unary = True
            elif token == ')':
                topToken = opStack.pop()
                unary = False
                while topToken != '(':
                    postfixList.append(topToken)
                    topToken = opStack.pop()
                if not opStack.isEmpty() and opStack.peek() == "neg":
                    postfixList.append(opStack.pop())
            else:
                while (not opStack.isEmpty()) and \
                   (prec[opStack.peek()] >= prec[token]):
                      postfixList.append(opStack.pop())
                opStack.push(token)

        while not opStack.isEmpty():
            postfixList.append(opStack.pop())
        return " ".join(postfixList)

Now the output is

1 3 + neg 4 * -9.2 0 - 5 8 + * - 4 -2 * -

UPDATE 2020-03-13

In the original post I put the following sentence:

You expect unary plus and minus operations at the start of the string or right after the opening '('.

The code there and in the previous update also reflects this. I knew that it is technically not completely correct. The unary operation can be expected as well after an operation. However, I did not want to allow expressions like 2+--+-+3, so I excluded possibility for unary operations after an operation. Unfortunately, it also excludes possibility for 2^-3. If you want to be able to parse expressions like 2^-3, then you just need to allow the unary operation after another operation, it requires adding of a single line unary = True in the else branch:

            else:
                while (not opStack.isEmpty()) and \
                   (prec[opStack.peek()] >= prec[token]):
                      postfixList.append(opStack.pop())
                opStack.push(token)
                unary = True   # This is the only new line

Now you can parse 2^-3 as 2^(-3). However, it also allows parsing of 2+-3 as 2+(-3). I always found the last possibility very ugly in computer languages, but if it's ok for you - fine. Of course, you can also allow parsing of unary operation only after ^, but not after other operations. This will require checking of the current token, and setting unary to True only if the token is in the list of operations allowing unary minus after it.

I see that you are also handling unary minus as an operator. Solid addition. +1 (see also https://stackoverflow.com/questions/17254080/infix-to-postfix-algorithm-that-takes-care-of-unary-operators) — Kenny Ostrom, Mar 12 '20 at 16:22
How do you process 4*(-2)? You push 4, push 2, then negate 2 to get -2, then multiply. So that's why it is 4, 2, negate, *. — Alex Sveshnikov, Mar 12 '20 at 19:16
Of course you can process it as 4, -2, *. However in this implementation all number operands are positive (well, non-negative). So, you cannot have a single operand '-2'. You need to interpret it as +2, then negation. The negation operation is anyway required to process unary minus in front of non-number operands, i.e. in front of parenthesis. For example, at the start of output you see 1, 3, +, neg. It's the result of processing of '-((1+3))'. Because neg operation is required in this case I decided to make all number operands positive and process negative numbers as a number followed by neg. — Alex Sveshnikov, Mar 12 '20 at 19:27

Kenny Ostrom · Answer 2 · 2020-03-12T16:53:48.627

You could simply test for integer or float numbers with a try-except, and this will handle negative numbers, too. The problem is that splitting on spaces is a lot less flexible and reliable than actually parsing the tokens, and it places a huge burden on whoever uses the function.

You want a tokenizer function. Fortunately, python has a tokenizer module, although it's not all that easy to jump into the first time. Or you can write your own.

Here's a quick implementation using the library

from io import StringIO
from tokenize import generate_tokens, NUMBER, OP

def tokenizer(s):
    generator = generate_tokens(StringIO(s).readline)
    for toknum, tokval, _, _, _ in generator:
        if toknum in (NUMBER, OP):
            yield tokval

Just change your code to use

for token in tokenizer(infixexpr):

That fixes longer numbers and decimal numbers, and handles your test case with all spaces removed:

print (infixToPostfix("((1+3))*4-(9.2-0)*(5+8)"))
1 3 + 4 * 9.2 0 - 5 8 + * -

(I thought this was supposed to be a standalone function, not a class member. You might want to make it so by unindenting the function.)

Negative numbers will require a little more, because the tokenizer will immediately return "-" as an operator. You can write your own tokenizer function that reads -55 as one token, or you can keep track of the state and realize that if you're not expecting an operator, then a minus sign has to signal that the next token is a negative number. See How to differentiate '-' operator from a negative number for a tokenizer

A further problem beyond the issues you asked about is unary operators. If you allow a minus sign before an expression, then you have to handle it as an operator. Alex handled them in the other answer, and you can look at Infix to postfix algorithm that takes care of unary operators Some implementations print negative numbers as "(-5)" in postfix. Some use spaces, although it saves space if you don't have spaces -- it's not really human readable anyway.

Actually, you might want to use the tokenizer's type output in your class to help identify numbers and operators, instead of hiding it inside a tokenizer function like I did here. — Kenny Ostrom, Mar 12 '20 at 16:03
Hi thank you for you answer. This is way above my scope of knowledge of python. I will try and learn about it now. — , Mar 12 '20 at 19:16

Modify piece of code to allow floats and negative numbers as well as an arbitrary amount of spaces in between the characters in the inputted string

So here is my fix to allow floats as well:

So final question TL:DR

2 Answers2