24

In my code, I'm using eval to evaluate a string expression given by the user. Is there a way to compile or otherwise speed up this statement?

import math
import random

result_count = 100000
expression = "math.sin(v['x']) * v['y']"

variable = dict()
variable['x'] = [random.random() for _ in xrange(result_count)]
variable['y'] = [random.random() for _ in xrange(result_count)]

# optimize anything below this line

result = [0] * result_count

print 'Evaluating %d instances of the given expression:' % result_count
print expression

v = dict()
for index in xrange(result_count):
    for name in variable.keys():
        v[name] = variable[name][index]
    result[index] = eval(expression) # <-- option ONE
    #result[index] = math.sin(v['x']) * v['y'] # <-- option TWO

For a quick comparison, option ONE takes 2.019 seconds on my machine, while option TWO takes only 0.218 seconds. Surely Python has a way of doing this without hard-coding the expression.

devtk
  • 4
    Check out some alternatives to eval in this post http://stackoverflow.com/questions/1832940 as well as some good reasons to stay away from it. – Paul Sasik Sep 17 '12 at 21:53
  • 2
    What if the user types `import os;os.system("rm -rf /")`? You need to write a parser to interpret the input string, and only recognise what you expect: `sin`, `cos`, `log`, etc. Throw an error if what they enter doesn't work. It could be bad if you don't do that. – John Lyon Sep 17 '12 at 22:05
  • 3
    If the user wants to "rm -rf /" or ":(){ :|: & };:" he can do it in a shell instead of within Python. – devtk Sep 17 '12 at 22:15
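
For anyone who does want the "only recognise what you expect" approach from the comment above, here is a minimal sketch using the standard `ast` module. It assumes Python 3.9 or newer, and the whitelist contents and the helper name `check_expression` are hypothetical, chosen only for illustration. Treat it as a starting point rather than a vetted sandbox.

```python
import ast
import math

# Hypothetical whitelist: adjust to whatever your expressions may use.
ALLOWED_NAMES = {"math", "v"}
ALLOWED_MATH_ATTRS = {"sin", "cos", "log"}
ALLOWED_NODES = (
    ast.Expression, ast.BinOp, ast.UnaryOp, ast.Call, ast.Name, ast.Load,
    ast.Attribute, ast.Subscript, ast.Constant,
    ast.Add, ast.Sub, ast.Mult, ast.Div, ast.Pow, ast.USub, ast.UAdd,
)


def check_expression(source):
    """Parse source as a single expression, reject anything outside the
    whitelist, and return a code object ready for eval()."""
    tree = ast.parse(source, mode="eval")  # statements raise SyntaxError
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            raise ValueError("disallowed syntax: %s" % type(node).__name__)
        if isinstance(node, ast.Name) and node.id not in ALLOWED_NAMES:
            raise ValueError("disallowed name: %s" % node.id)
        if isinstance(node, ast.Attribute):
            # Only math.<whitelisted function> is accepted as an attribute.
            is_math_attr = (
                isinstance(node.value, ast.Name)
                and node.value.id == "math"
                and node.attr in ALLOWED_MATH_ATTRS
            )
            if not is_math_attr:
                raise ValueError("disallowed attribute: %s" % node.attr)
    return compile(tree, "<user expression>", "eval")


code = check_expression("math.sin(v['x']) * v['y']")
namespace = {"__builtins__": {}, "math": math, "v": {"x": 0.5, "y": 2.0}}
value = eval(code, namespace)
```

Because `check_expression` returns a compiled code object, the validation cost is paid once and the result can be reused for every row.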

3 Answers

42

You can also trick Python into compiling the expression only once by wrapping it in a lambda:

expression = "math.sin(v['x']) * v['y']"
exp_as_func = eval('lambda: ' + expression)

And then use it like so:

exp_as_func()

Speed test:

In [17]: %timeit eval(expression)
10000 loops, best of 3: 25.8 us per loop

In [18]: %timeit exp_as_func()
1000000 loops, best of 3: 541 ns per loop

As a side note, if v is not a global, you can create the lambda like this:

exp_as_func = eval('lambda v: ' + expression)

and call it:

exp_as_func(my_v)
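
To show how this might slot into the loop from the question, here is a rough sketch in Python 3 syntax (`range` instead of `xrange`). The smaller `result_count` and the explicit `{"math": math}` namespace passed to `eval` are my own choices for illustration, not part of the original code.

```python
import math
import random

result_count = 1000  # reduced from the question's 100000 to keep this quick
expression = "math.sin(v['x']) * v['y']"

variable = {
    "x": [random.random() for _ in range(result_count)],
    "y": [random.random() for _ in range(result_count)],
}

# The string is parsed and compiled exactly once, here.
# Passing the namespace explicitly avoids depending on module globals.
exp_as_func = eval("lambda v: " + expression, {"math": math})

names = list(variable)
result = [
    # Build the per-row dict and pass it in as the lambda's argument.
    exp_as_func({name: variable[name][index] for name in names})
    for index in range(result_count)
]
```

Each call now only executes the already-compiled function body; the per-row dictionary construction is likely to be the remaining overhead.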
Ohad
  • 3
    This is a noticeable speed improvement over F.J.'s response, which was already a big speed improvement. – devtk Sep 17 '12 at 23:03
  • I guess this trick is equivalent to using `compile` before eval, because when you run it you get `The slowest run took 17.90 times longer than the fastest. This could mean that an intermediate result is being cached`. – Mermoz May 25 '16 at 15:20
  • 5
    Who is "F. J."? – Justas Jun 27 '19 at 00:44
21

You can avoid the overhead by compiling the expression in advance, using either the built-in compile() (available in both Python 2 and Python 3) or, in Python 2 only, compiler.compile() from the deprecated compiler module:

In [1]: import math, compiler

In [2]: v = {'x': 2, 'y': 4}

In [3]: expression = "math.sin(v['x']) * v['y']"

In [4]: %timeit eval(expression)
10000 loops, best of 3: 19.5 us per loop

In [5]: compiled = compiler.compile(expression, '<string>', 'eval')

In [6]: %timeit eval(compiled)
1000000 loops, best of 3: 823 ns per loop

Just make sure you compile only once, outside the loop. As mentioned in the comments, be very careful about what you accept when using eval on user-submitted strings.
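
As a rough sketch of "compile once, evaluate many times" with the built-in `compile()`, in Python 3 syntax: the `rows` data and the explicit `namespace` dict below are made up for illustration.

```python
import math

expression = "math.sin(v['x']) * v['y']"

# Compile once, outside the loop. Mode "eval" means a single expression.
code = compile(expression, "<string>", "eval")

# Illustrative input rows; in the question these come from the user's data.
rows = [
    {"x": 0.0, "y": 4.0},
    {"x": math.pi / 2, "y": 3.0},
]

namespace = {"math": math}
results = []
for v in rows:
    namespace["v"] = v                     # rebind v for this row
    results.append(eval(code, namespace))  # no re-parsing happens here
```

Passing an explicit globals dict to `eval` also keeps the expression from seeing the rest of your module's names, although on its own that is not a security boundary.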

Stéphane
Andrew Clark
4

I think you are optimising the wrong end. If you want to perform the same operation for a lot of numbers you should consider using numpy:

import numpy
import time
import math
import random

result_count = 100000
expression = "sin(x) * y"

namespace = dict(
    x=numpy.array(
        [random.random() for _ in xrange(result_count)]),
    y=numpy.array(
        [random.random() for _ in xrange(result_count)]),
    sin=numpy.sin,
)
print ('Evaluating %d instances '
       'of the given expression:') % result_count
print expression

start = time.time()
result = eval(expression, namespace)
numpy_time = time.time() - start
print "With numpy:", numpy_time


assert len(result) == result_count
assert all(math.sin(a) * b == c for a, b, c in
           zip(namespace["x"], namespace["y"], result))

To give you an idea of the possible gain, I've added a variant using generic Python and the lambda trick:

from math import sin
from itertools import izip

start = time.time()
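f = eval("lambda: " + expression)
# f() reads the globals x and y, which the Python 2 list comprehension
# below rebinds on every iteration (this does not work in Python 3)
result = [f() for x, y in izip(namespace["x"], namespace["y"])]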
generic_time = time.time() - start
print "Generic python:", generic_time
print "Ratio:", (generic_time / numpy_time)
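
The generic variant above depends on Python 2 list comprehensions leaking their loop variables into the enclosing scope. A sketch that avoids this, and so should also run on Python 3, passes `x` and `y` as explicit lambda parameters. The names `xs` and `ys`, the smaller `result_count`, and the `{"sin": sin}` namespace are my own assumptions for illustration.

```python
import random
from math import sin

result_count = 1000  # reduced from 100000 to keep the sketch quick
expression = "sin(x) * y"

xs = [random.random() for _ in range(result_count)]
ys = [random.random() for _ in range(result_count)]

# Compile the expression once as a two-argument function.
# x and y become fast local variables inside the lambda body.
f = eval("lambda x, y: " + expression, {"sin": sin})

result = [f(x, y) for x, y in zip(xs, ys)]
```

I have not timed this against the numbers above, so treat any speed difference as something to measure on your own machine.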

Here are the results on my aging machine:

$ python speedup_eval.py 
Evaluating 100000 instances of the given expression:
sin(x) * y
With numpy: 0.006098985672
Generic python: 0.270224094391
Ratio: 44.3063992807

The speed-up is not as high as I expected, but still significant.

Peter Otten
  • I don't have access to `numpy` here. But I agree, it might speed things up. I'm generally against relying on a third party library if I can get by without it. – devtk Sep 18 '12 at 16:51