1

Say I have 50,000 lines of string that contain simple mathematical expression (only +,- operator involve e.g. 1+2+3+5). I know that it is handy to use eval() in Python to evaluate those strings. However, the program is not efficient enough. I ran the cProfile and found most of the bottleneck is from the eval function (around 2.5 secs in 50,000 lines case). I've tried to write my own evaluation parser but it perform even slower then the eval.

So, what I want to ask is if there are any way to fast evaluate mathematical expression strings or improve the performance of eval()? Third-party package cannot be used.

The original problem is like this We have a string of digit like 1234567 and we can insert +,-,or nothing between the digits like 1+23-4+56-7. So there will be 3^(digit-1) combinations for a given number string

What I implement in Python to calculate and generate string like the following

import itertools
def gen_Eq(op, num):
    temp = [None]*(2*len(num)-1)
    temp[::2] = num
    temp[1::2] = op
    string = ''.join(temp)
    return string

def cal_Eq(num_string):
    op_list = tuple(itertools.product(['+','-',''],repeat=len(num_string)-1))
    eq = list(map(gen_Eq,op_list,itertools.repeat(num_string,len(op_list))))
    print map(eval,eq)
Gel Beaner
  • 25
  • 6
  • 2
    can you post some sample data/code of what you've tried (even if it was very slow, there could be many reasons for poor performance)? – qwwqwwq May 24 '13 at 04:39
  • 1
    This link should answer the question: http://stackoverflow.com/questions/2371436/evaluating-a-mathematical-expression-in-a-string –  May 24 '13 at 05:07
  • @user1460217 I did saw the link you provide. The thing I really want to know the performance of eval function and possible way to improve that. – Gel Beaner May 24 '13 at 05:33
  • What? You said you tried eval() so you know it performs 'poorly', there is not way to speed it up. Calling eval() is overkill in this case, you should go with user1460217's suggestions, or if possible you could write the equations in a way that is easier to pares (like RPN) and write a parser for that, but really, the above mentioned suggestion is your best option. – Blubber May 24 '13 at 06:10
  • Come to think of it, you could try to read the whole file and transform it to expressions of the for x1 = ....; x2 = ...; where ... is your line of code, and parse that in one go. But the suggestions above is much cleaner. – Blubber May 24 '13 at 06:13
  • If I understood, you generate yourself all the expressions you evaluate. Try to come up with an algorithm that exploits partial similarities of the expressions. – Janne Karila May 24 '13 at 08:38
  • @Blubber I try to write my own RPN but it dosen't improve the speed to much. My guess is my implementation involves some list operation(for simulating stack) which compares to the eval function(written the parser in C) would give me too much performance benefit. – Gel Beaner May 24 '13 at 09:07
  • @JanneKarila Your answer below is a nice and neat. Fully take the advantage of the simple +,- operation into sum. Thank you for your answer. – Gel Beaner May 24 '13 at 09:11

1 Answers1

0

This approach is faster:

>>> import re
>>> split_numbers = re.compile(r'-?\d+').findall
>>> sum(int(x) for x in split_numbers('1+23-4+56-7'))
69

In my timings the sum expression takes 4.5 µs vs. 13 µs for eval('1+23-4+56-7')

Note, however, that it does not handle consecutive + and -, eg. 1-+2 or 1--2, or spaces: 1 - 2.

Janne Karila
  • 24,266
  • 6
  • 53
  • 94