6

I need a function which takes one of Python's operator symbols or keywords as a string, along with its operands, evaluates it, and returns the result. Like this:

>>> string_op('<=', 3, 3)
True
>>> string_op('|', 3, 5)
7
>>> string_op('and', 3, 5)
True
>>> string_op('+', 5, 7)
12
>>> string_op('-', -4)
4

The string cannot be assumed to be safe. I will be satisfied with just mapping the binary operators, but I'd be extra happy if I could get all of them.

My current implementation manually maps the symbols to the functions in the operator module:

import operator

def string_op(op, *args, **kwargs):
    """http://docs.python.org/2/library/operator.html"""
    symbol_name_map = {
        '<': 'lt',
        '<=': 'le',
        '==': 'eq',
        '!=': 'ne',
        '>=': 'ge',
        '>': 'gt',
        'not': 'not_',
        'is': 'is_',
        'is not': 'is_not',
        '+': 'add', # conflict with concat
        '&': 'and_', # (bitwise)
        '/': 'div', # Python 2 only; Python 3 has 'truediv' instead
        '//': 'floordiv',
        '~': 'invert',
        '%': 'mod',
        '*': 'mul',
        '|': 'or_', # (bitwise)
        '**': 'pow',
        '-': 'sub', # conflicts with neg
        '^': 'xor',
        'in': 'contains', # operands reversed: contains(a, b) is `b in a`
        '+=': 'iadd', # conflict with iconcat
        '&=': 'iand',
        '/=': 'idiv', # Python 2 only; Python 3 has 'itruediv' instead
        '//=': 'ifloordiv',
        '<<=': 'ilshift',
        '%=': 'imod',
        '*=': 'imul',
        '|=': 'ior',
        '**=': 'ipow',
        '>>=': 'irshift',
        '-=': 'isub',
        '^=': 'ixor',
    }
    if op in symbol_name_map:
        return getattr(operator, symbol_name_map[op])(*args, **kwargs)
    else:
        return getattr(operator, op)(*args, **kwargs)

This solution fails on the overloaded operators -- add/concat and sub/neg. Checks could be added for those cases, inspecting types or counting arguments to pick the right function name, but that feels a bit ugly. It's what I'll go with if I don't get a better idea here.

The thing that is bugging me is that Python already does this. It already knows how to map symbols to operator functions, but as far as I can tell, that functionality is not exposed to the programmer. It seems like everything else in Python, right down to the pickling protocol, is exposed to programmers. So where is this? Or why isn't it?

Phil
  • I'm pretty sure you don't actually need to worry about `add` vs. `concat` unless you're dealing with the C API (or using some funky C extension type that implements both slots and does different things for them). In other words, unless I'm mistaken, from Python, `operator.add(seq1, seq2)` should work, and call the concat slot if there is no add slot, so you're fine just using `add`. – abarnert Feb 04 '13 at 21:17
  • Also, Python doesn't actually map symbols to `operator` functions; it maps symbols to dunder methods like `__add__` (and actually, even that isn't quite accurate, because of the C extension slots). The `operator` module is just a bunch of functions that happen to call the same dunder methods. – abarnert Feb 04 '13 at 21:21
  • Cool! You're right, adding sequences does work with my solution. Not as much luck with negation/subtraction though. Yeah, I didn't mean python did the same thing as my code, I meant it does what I'm trying to do. It would feel right if I could hook into that system instead of doing it myself. – Phil Feb 04 '13 at 21:26
  • @Phil: Negation is a unary operator, subtraction is a binary operator. `-1` is negative 1, `1-1` is subtraction. So count the operands and you should be fine! – Martijn Pieters Feb 04 '13 at 21:29
  • Would [pyparsing](http://pyparsing.wikispaces.com/Examples) help? There is an [example](http://pyparsing.wikispaces.com/file/view/fourFn.py/30154950/fourFn.py) that has some of what you want, although it would mean writing code like ``string_op("3 <= 3")``, which may well not be what you want. – sotapme Feb 04 '13 at 21:32
  • @MartijnPieters: Actually, I think `-1` is not the unary operator `-` applied to the number `1`, it's just the number `-1`. Try `ast.parse('-1').body[0].value.n`. – abarnert Feb 04 '13 at 21:34
  • @abarnert: pick, pick, pick... `-(1)` then.. :-) – Martijn Pieters Feb 04 '13 at 21:35
  • @MartijnPieters as I wrote in the question, I can make my example work by counting operators, but I'd like a cleaner solution than what I have already, not a dirtier one. For my purposes I could probably use `*` and `-1` when I need to negate, and then I have all of them except that one. – Phil Feb 04 '13 at 21:38
  • @Phil: I think you actually want to have separate mappings for unary and binary functions. So, `string_op('-', 4)` calls `operator.neg(4)`, while `string_op('-', 4, 0)` calls `operator.sub(4, 0)`. – abarnert Feb 04 '13 at 21:40
  • @Phil: Except that `-somevariable` can mean different things for different types; objects can implement [hooks for unary operators](http://docs.python.org/2/reference/datamodel.html#object.__neg__). `-somevariable` is *not* the same thing as `somevariable * -1`. – Martijn Pieters Feb 04 '13 at 21:40
  • @MartijnPieters Very true. What I meant was just not supporting unary `-` at all (it's still available via `neg`). But I'm starting to come around to your suggestion of counting arguments, and @abarnert's suggestion to split the maps up into unary/binary operators seems like a fairly clean route for that. – Phil Feb 04 '13 at 21:55
  • What are you trying to do? I've got a suspicion this isn't the best way to go about it. – Winston Ewert Feb 04 '13 at 21:58
  • @WinstonEwert I was waiting for that comment :). Declarative user-configured data processing. This particular routine will apply an operation to all values in a column of data. So the config options available to the user are: column, operator, and (optionally) value. These are values in a yaml file. – Phil Feb 04 '13 at 22:12
  • @Phil: It probably would have been better to head off that comment by explaining it in the question. You got away with it this time because the question was interesting enough that everyone wanted to chip in anyway… but in general, you're asking for downvotes or closes… – abarnert Feb 04 '13 at 22:16
  • @abarnert yeah... I left it out because I wanted an answer to the question more than a solution to my problem. People on stack are so pragmatic! – Phil Feb 04 '13 at 22:42

4 Answers

6

Python does not map symbols to operator functions. It interprets symbols by calling special dunder methods.

For example, when you write 2 * 3, it doesn't call mul(2, 3); it calls some C code that figures out whether to use two.__mul__, three.__rmul__, or the C-type equivalents (the slots nb_multiply and sq_repeat are both equivalent to both __mul__ and __rmul__). You can call that same code from a C extension module as PyNumber_Multiply(two, three). If you look at the source to operator.mul, it's a completely separate function that calls the same PyNumber_Multiply.

So, there is no mapping from * to operator.mul for Python to expose.
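To see that the `*` symbol and `operator.mul` are two separate entry points that end up in the same dunder hooks, here is a small sketch. The `Tracer` class is hypothetical and exists only to record which method gets called; it is not part of Python or of the `operator` module.

```python
import operator

class Tracer:
    """Hypothetical class that records which dunder method was invoked."""
    def __init__(self):
        self.calls = []

    def __mul__(self, other):
        self.calls.append('__mul__')
        return 'left'

    def __rmul__(self, other):
        self.calls.append('__rmul__')
        return 'right'

t = Tracer()
# The symbol and the operator function behave identically on either side.
results = [t * 2, operator.mul(t, 2), 2 * t, operator.mul(2, t)]
print(results)
print(t.calls)
```

In both spellings the left operand's `__mul__` is tried first, and `__rmul__` is used when the left operand (here an `int`) does not know how to handle the right one.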

If you want to do this programmatically, the best I can think of is to parse the docstrings of the operator functions (or, maybe, the operator.c source). For example:

import operator
import re

runary = re.compile(r'Same as (.+)a')
rbinary = re.compile(r'Same as a (.+) b')
unary_ops, binary_ops = {}, {}
funcnames = dir(operator)
for funcname in funcnames:
    # Skip private names and the reflected/in-place variants (e.g. iadd when add exists).
    if (not funcname.startswith('_') and
        not (funcname.startswith('r') and funcname[1:] in funcnames) and
        not (funcname.startswith('i') and funcname[1:] in funcnames)):
        func = getattr(operator, funcname)
        doc = func.__doc__ or ''  # __doc__ can be None (e.g. under -OO)
        m = runary.search(doc)
        if m:
            unary_ops[m.group(1)] = func
        m = rbinary.search(doc)
        if m:
            binary_ops[m.group(1)] = func

I don't think this misses anything, but it definitely has some false positives, like "a + b, for a " as an operator that maps to operator.concat and callable( as an operator that maps to operator.isCallable. (The exact set depends on your Python version.) Feel free to tweak the regexes, blacklist such methods, etc. to taste.

However, if you really want to write a parser, you're probably better off writing a parser for your actual language than writing a parser for the docstrings to generate your language parser…

If the language you're trying to parse is a subset of Python, Python does expose the internals to help you there. See the ast module for the starting point. You might still be happier with something like pyparsing, but you should at least play with ast. For example:

import ast

sentinel = object()
def string_op(op, arg1, arg2=sentinel):
    # Unary: "op arg1"; binary: "arg1 op arg2".
    s = '{} {}'.format(op, arg1) if arg2 is sentinel else '{} {} {}'.format(arg1, op, arg2)
    a = ast.parse(s).body
    return a

Print out a (or, better, ast.dump(a[0]), since a is a list of statement nodes), play with it, etc. You'll still need to map from _ast.Add to operator.add, however. But if you want to instead map to an actual Python code object… well, the code for that is available too.
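As a rough sketch of that last mapping step (assuming Python 3), the following parses the symbol between placeholder names and looks up the resulting AST node type. The tables and the `op_from_symbol` name are hypothetical and deliberately partial; nothing here is evaluated, so an unsafe string can at worst be rejected.

```python
import ast
import operator

# Hypothetical, partial tables from AST operator node types to functions.
BINARY_NODES = {ast.Add: operator.add, ast.Sub: operator.sub,
                ast.Mult: operator.mul, ast.BitOr: operator.or_}
UNARY_NODES = {ast.USub: operator.neg, ast.Invert: operator.invert}
COMPARE_NODES = {ast.LtE: operator.le, ast.Eq: operator.eq}

def _is_name(node):
    return isinstance(node, ast.Name)

def op_from_symbol(symbol, arity):
    """Return the operator function for `symbol` by parsing it, never evaluating it."""
    source = '{} a'.format(symbol) if arity == 1 else 'a {} b'.format(symbol)
    try:
        node = ast.parse(source, mode='eval').body
    except SyntaxError:
        raise ValueError('not an operator: {!r}'.format(symbol)) from None
    func = None
    # Requiring plain Name operands rejects smuggled sub-expressions like '+ b +'.
    if arity == 1 and isinstance(node, ast.UnaryOp) and _is_name(node.operand):
        func = UNARY_NODES.get(type(node.op))
    elif (arity == 2 and isinstance(node, ast.BinOp)
          and _is_name(node.left) and _is_name(node.right)):
        func = BINARY_NODES.get(type(node.op))
    elif (arity == 2 and isinstance(node, ast.Compare) and len(node.ops) == 1
          and _is_name(node.left) and _is_name(node.comparators[0])):
        func = COMPARE_NODES.get(type(node.ops[0]))
    if func is None:
        raise ValueError('unsupported operator: {!r}'.format(symbol))
    return func

print(op_from_symbol('+', 2)(5, 7))
print(op_from_symbol('-', 1)(-4))
```

The parser distinguishes unary from binary `-` for free, but the node-to-function tables still have to be written by hand, which is the limitation noted above.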

abarnert
3

You can use a crude regex. For example:

import operator
import re

def get_symbol(op):
    doc = getattr(operator, op).__doc__
    sym = re.sub(r'.*\w\s?(\S+)\s?\w.*', r'\1', doc)
    # Keep the result only if it is made up entirely of non-word characters.
    if re.match(r'^\W+$', sym):
        return sym

Examples:

>>> get_symbol('matmul')
'@'
>>> get_symbol('add')
'+'
>>> get_symbol('eq')
'=='
>>> get_symbol('le')
'<='
>>> get_symbol('mod')
'%'
>>> get_symbol('inv')
'~'
>>> get_symbol('ne')
'!='

Just to mention a few. You could also do:

{get_symbol(i):i for i in operator.__all__} 

This gives you a dictionary keyed by symbol. You will see that some names, such as abs, give incorrect results, since they have no symbolic form.

Onyambu
2

If you're going to use such a map, why not map directly to functions instead of having a layer of indirection by name? For example:

symbol_func_map = {
    '<': (lambda x, y: x < y),
    '<=': (lambda x, y: x <= y),
    '==': (lambda x, y: x == y),
    #...
}

While this wouldn't be any more concise than your current implementation, it should get the correct behaviour in the majority of cases. The remaining problems are where a unary and a binary operator conflict, and those could be addressed by adding arity to the dictionary keys:

symbol_func_map = {
    ('<', 2): (lambda x, y: x < y),
    ('<=', 2): (lambda x, y: x <= y),
    ('==', 2): (lambda x, y: x == y),
    ('-', 2): (lambda x, y: x - y),
    ('-', 1): (lambda x: -x),
    #...
}
Weeble
  • You know, I started with almost exactly this, but got tired of typing `lambda x, y:` a million times. But now that you bring it up again, I really like that it uses the symbols it's mapping directly, instead of creating its own map to python operator functions. – Phil Feb 04 '13 at 22:46
  • @Phil: You could always generate this code programmatically (either at runtime, or with a code generator step that's part of your "build" process) instead of typing it yourself. In fact, I think I'd do it that way to avoid the chance of errors. (You could even combine the two answers, and walk the list of operators from `ast` or `operator` and use that to generate the code.) – abarnert Feb 04 '13 at 22:48
  • I would use the functions supplied by the `operator` module, as they avoid the overhead of defining a function that just wraps the underlying functions. – chepner Apr 22 '22 at 14:29
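Combining the arity-keyed dictionary from this answer with the suggestion to reuse the `operator` functions, a minimal sketch (assuming Python 3) might look like the following. The `OPS` table is illustrative and partial, and raising `ValueError` for unknown symbols is an assumption about how untrusted input should be handled.

```python
import operator

# (symbol, number of operands) -> function; extend as needed.
OPS = {
    ('<=', 2): operator.le,
    ('|', 2): operator.or_,
    ('+', 2): operator.add,
    ('+', 1): operator.pos,
    ('-', 2): operator.sub,
    ('-', 1): operator.neg,
    ('~', 1): operator.invert,
    ('not', 1): operator.not_,
}

def string_op(op, *args):
    """Look up `op` by symbol and operand count, then apply it to the operands."""
    try:
        func = OPS[(op, len(args))]
    except KeyError:
        raise ValueError(
            'unsupported operator {!r} with {} operand(s)'.format(op, len(args))
        ) from None
    return func(*args)

print(string_op('-', -4))    # unary: operator.neg
print(string_op('-', 4, 0))  # binary: operator.sub
```

Because the string is only ever used as a dictionary key, an unexpected value cannot reach `getattr` or `eval`; it simply fails the lookup.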
-1

You could use eval to generate lambda functions for the operators instead of using the operator module. Eval is generally bad practice, but I think it's fine for this purpose, because it only ever sees operator symbols that are hard-coded in the program.

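def make_binary_op(symbol):
    # Only call this with symbols hard-coded in the program, never with user input.
    return eval('lambda x, y: x {0} y'.format(symbol))

operators = {}
for symbol in '+ - * / ^ %'.split(' '):  # extend the list with other operators (etc...)
    operators[symbol] = make_binary_op(symbol)

operators['*'](3, 5) # == 15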
vgel
  • Cool approach, very concise. Still needs a hard-coded list of all valid operators though, which is what I was trying to avoid. – Phil Feb 21 '14 at 18:54
  • Nice use of `eval()`; _as long as something else doesn't call `make_binary_op()` incorrectly_, there are no security issues since you're only passing in characters you provide. – Cyphase Aug 15 '15 at 00:10
  • You can replace `for operator in '+ - * / ^ %'.split(' '):` with `for operator in '+-*/^%':`. – Cyphase Aug 15 '15 at 00:11
  • @Cyphase not really, because some operators have more than one character, e.g. `//` or `>=`. – frnhr Jul 16 '21 at 19:39
  • **[Do not ever use `eval` (or `exec`) on data that could possibly come from outside the program in any form. It is a critical security risk. You allow the author of the data to run arbitrary code on your computer](https://stackoverflow.com/questions/1832940/why-is-using-eval-a-bad-practice). It [cannot easily be sandboxed, and proper sandboxing is harder than using a proper tool for the job.](https://stackoverflow.com/questions/3068139)** – Karl Knechtel Mar 29 '23 at 01:10
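In light of the warnings above, one way to keep this approach while making sure an untrusted string never reaches `eval` is to check the symbol against a fixed allowlist first. This is only a sketch: the `ALLOWED_SYMBOLS` tuple is a hypothetical, partial list, and the choice of `ValueError` is an assumption.

```python
# Hypothetical allowlist; only these exact strings are ever passed to eval.
ALLOWED_SYMBOLS = ('+', '-', '*', '/', '//', '%', '**', '^', '&', '|', '<<', '>>')

def make_binary_op(symbol):
    """Build a two-argument function for `symbol`, refusing anything not allowlisted."""
    if symbol not in ALLOWED_SYMBOLS:
        raise ValueError('unsupported operator: {!r}'.format(symbol))
    return eval('lambda x, y: x {0} y'.format(symbol))

operators = {symbol: make_binary_op(symbol) for symbol in ALLOWED_SYMBOLS}

print(operators['*'](3, 5))
print(operators['//'](7, 2))
```

Iterating over a tuple of strings rather than over the characters of one string also keeps multi-character operators such as `//` and `**` intact.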