Can this Python postfix notation (reverse polish notation) interpreter be made more efficient and accurate?

Question

Here is a Python postfix notation interpreter which utilizes a stack to evaluate the expressions. Is it possible to make this function more efficient and accurate?

#!/usr/bin/env python


import operator
import doctest


class Stack:
    """A stack is a collection, meaning that it is a data structure that 
    contains multiple elements.

    """

    def __init__(self):
        """Initialize a new empty stack."""
        self.items = []       

    def push(self, item):
        """Add a new item to the stack."""
        self.items.append(item)

    def pop(self):
        """Remove and return an item from the stack. The item 
        that is returned is always the last one that was added.

        """
        return self.items.pop()

    def is_empty(self):
        """Check whether the stack is empty."""
        return (self.items == [])


# Map supported arithmetic operators to their functions
ARITHMETIC_OPERATORS = {"+":"add", "-":"sub", "*":"mul", "/":"div", 
                        "%":"mod", "**":"pow", "//":"floordiv"}


def postfix(expression, stack=Stack(), operators=ARITHMETIC_OPERATORS):
    """Postfix is a mathematical notation wherein every operator follows all 
    of its operands. This function accepts a string as a postfix mathematical 
    notation and evaluates the expressions. 

    1. Starting at the beginning of the expression, get one term 
       (operator or operand) at a time.
       * If the term is an operand, push it on the stack.
       * If the term is an operator, pop two operands off the stack, 
         perform the operation on them, and push the result back on 
         the stack.

    2. When you get to the end of the expression, there should be exactly 
       one operand left on the stack. That operand is the result.

    See http://en.wikipedia.org/wiki/Reverse_Polish_notation

    >>> expression = "1 2 +"
    >>> postfix(expression)
    3
    >>> expression = "5 4 3 + *"
    >>> postfix(expression)
    35
    >>> expression = "3 4 5 * -"
    >>> postfix(expression)
    -17
    >>> expression = "5 1 2 + 4 * + 3 -"
    >>> postfix(expression, Stack(), ARITHMETIC_OPERATORS)
    14

    """
    if not isinstance(expression, str):
        return
    for val in expression.split(" "):
        if operators.has_key(val):
            method = getattr(operator, operators.get(val))
            # The user has not input sufficient values in the expression
            if len(stack.items) < 2:
                return
            first_out_one = stack.pop()
            first_out_two = stack.pop()
            operand = method(first_out_two, first_out_one)
            stack.push(operand)
        else:
            # Type check and force int
            try:
                operand = int(val)
                stack.push(operand)
            except ValueError:
                continue
    return stack.pop()


if __name__ == '__main__':
    doctest.testmod()

Pi Delport · Accepted Answer · 2010-10-05T18:24:19.663

10

General suggestions:

Avoid unnecessary type checks, and rely on default exception behavior.
has_key() has long been deprecated in favor of the in operator: use that instead.
Profile your program, before attempting any performance optimization. For a zero-effort profiling run of any given code, just run: python -m cProfile -s cumulative foo.py

Specific points:

list makes a good stack out of the box. In particular, it allows you to use slice notation (tutorial) to replace the pop/pop/append dance with a single step.
ARITHMETIC_OPERATORS can refer to operator implementations directly, without the getattr indirection.

Putting all this together:

ARITHMETIC_OPERATORS = {
    '+':  operator.add, '-':  operator.sub,
    '*':  operator.mul, '/':  operator.div, '%':  operator.mod,
    '**': operator.pow, '//': operator.floordiv,
}

def postfix(expression, operators=ARITHMETIC_OPERATORS):
    stack = []
    for val in expression.split():
        if val in operators:
            f = operators[val]
            stack[-2:] = [f(*stack[-2:])]
        else:
            stack.append(int(val))
    return stack.pop()

edited Oct 05 '10 at 18:24

answered Oct 05 '10 at 18:18

Pi Delport

10,356
3
36
50

2

Piet, this is a great answer. The program execution time has decreased. However, I do have a question. In `postfix`, would you catch all possible exceptions and bubble them up to the caller as a custom `Exception`, or would you wrap the call to `postfix` in a comprehensive try/except block? – simeonwillbanks Oct 05 '10 at 19:32
simeonwillbanks: When in doubt about exceptions, do nothing. :-) They're exceptional for a reason! It's rare to define your own exceptions: don't do it when a [built-in exception](http://docs.python.org/library/exceptions.html) would suffice. Also, don't feel compelled to write "comprehensive" try/except blocks: almost all code should let almost all exceptions pass, and handle only exceptions that they *know* they should handle. (One of the only places you'd ever want a catch-all exception handler is at the top level of a production application, for example, to report the error.) – Pi Delport Oct 05 '10 at 19:51
Piet, thanks for the tips. I'm accepting your solution! P.S. This past summer, I visited Cape Town for the 2010 World Cup. It is a lovely city, and I hope to return one day. – simeonwillbanks Oct 05 '10 at 21:09
This is a great answer. Could you explain the line stack[-2:] = [f(*stack[-2:])] though please. I can make out it calls the relevant method for the operator but beyond that I haven't got a 'Scooby'. – Caltor Sep 28 '13 at 01:22
@Caltor: That is a slice access and slice assignment: `stack[-2:]` represents the top two items operands on the stack. You can break it down as follows: `[a, b] = stack[-2:]; result = f(a, b); stack[-2:] = [r]` – Pi Delport Sep 29 '13 at 04:40
(Correction: That last assignment should read `stack[-2:] = [result]`, of course. Assigning to a two-item slice like that will replace both items with `result`, effectively the same as if you popped them, and then pushed `result`.) – Pi Delport Sep 29 '13 at 04:55
@PietDelport So is it roughly equivalent to `a = stack.pop(); b = stack.pop(); result = f(a, b); stack.append(result)` ? – Caltor Sep 30 '13 at 10:12
@Caltor: Yes; just the order of `a` and `b` differs with those pops. – Pi Delport Oct 02 '13 at 08:14
@PietDelport Ah yes sorry, I had `a` and `b` the wrong way round. My excuse is I still had `b operator a` in my head from messing with infix. :| – Caltor Oct 03 '13 at 10:00
1

@PietDelport Think I finally got my head around `stack[-2:] = [f(*stack[-2:])]` thanks to SO. The `-2:` accesses the last two elements in the list. The `*` unpacks/dereferences the slice of the list into arguments that can be passed into the function referenced by `f()`. I'm guessing the surrounding square brackets `[]` turns the result into a single item list so it can be assigned back into the list/stack. I'm going to stop hijacking this question now. – Caltor Oct 03 '13 at 10:21

nmichaels · Answer 2 · 2010-10-05T17:21:22.220

Lists can be used directly as stacks:

>>> stack = []
>>> stack.append(3) # push
>>> stack.append(2)
>>> stack
[3, 2]
>>> stack.pop() # pop
2
>>> stack
[3]

You can also put the operator functions directly into your ARITHMETIC_OPERATORS dict:

ARITHMETIC_OPERATORS = {"+":operator.add,
                        "-":operator.sub,
                        "*":operator.mul,
                        "/":operator.div, 
                        "%":operator.mod,
                        "**":operator.pow,
                        "//":operator.floordiv}

then

if operators.has_key(val):
    method = operators[val]

The goal of these is not to make things more efficient (though it may have that effect) but to make them more obvious to the reader. Get rid of unnecessary levels of indirection and wrappers. That will tend to make your code less obfuscated. It will also provide (trivial) improvements in performance, but don't believe that unless you measure it.

Incidentally, using lists as stacks is fairly common idiomatic Python.

Excellent point. Replacing the `Stack` abstract data type with a built-in `list` should make the `postfix` function faster. On my Mac (OS X 10.5.8/2.6 GHz Intel Core 2 Duo/4GB Mem), the built-in stack either runs in 0.18 seconds or 0.19 seconds. Consistently, the Stack ADT runs in 0.19 seconds. — simeonwillbanks, Oct 05 '10 at 17:23

score 2 · Answer 3 · 2010-10-05T19:18:12.587

2

You can directly map the operators: {"+": operator.add, "-": operator.sub, ...}. This is simpler, doesn't need the unnecessary getattr and also allows adding additional functions (without hacking the operator module). You could also drop a few temporary variables that are only used once anyway:

rhs, lhs = stack.pop(), stack.pop()
stack.push(operators[val](lhs, rhs)).

Also (less of a performance and more of a style issue, also subjective), I would propably don't do error handling at all in the loop and wrap it in one try block with an except KeyError block ("Unknown operand"), an except IndexError block (empty stack), ...

But accurate? Does it give wrong results?

edited Oct 05 '10 at 19:18

answered Oct 05 '10 at 17:27

Seconded on the error handling thing. Python exceptions are not C++ exceptions. – nmichaels Oct 05 '10 at 17:37
Thanks for the suggestions. All the doctests are passing, but they might not be comprehensive. Maybe there is still a bug? – simeonwillbanks Oct 05 '10 at 17:37
2

`stack.push(operators[val](stack.pop(), stack.pop()))` is buggy because the operator func expects the first out operand as the second argument. Originally, I coded this line: `method(stack.pop(), stack.pop())`, but the doctests were failing. – simeonwillbanks Oct 05 '10 at 17:40
@simeon: Sorry, scratch the `.pop(), .pop()` thing - fixed it. – Oct 05 '10 at 19:21

Can this Python postfix notation (reverse polish notation) interpreter be made more efficient and accurate?

3 Answers3

Linked