C parser for mathematical expressions with one variable and save it into a function

Question

I know there are libs like boost.spirit or the expression parser from http://jamesgregson.blogspot.de/2012/06/mathematical-expression-parser-in-c.html and many others.

However, my expression is to be evaluated quite a lot of times inside a loop and it seems quite inefficient to run the parser every time.

Suppose my expression string is "exp(-0.5*(t-t_0)^2/(b^2))" (simple gaussian), where t_0 and b are constants and only t (=time) varies within the loop.

instead of calling the parser millions of times, I'd like to be able to save the expression somehow into a function accepting one argument (for the variable t) which then evaluates the expression.

Do you have an idea know how this can be done or if it can be done at all?

Unless I'm missing something, it sounds like you just need a function with one parametre, and that it makes the calculation when called and returns the result, pretty much as you described. It sounds like you have 2 constants and the formula stays the same. — AntonH, Feb 27 '17 at 16:15
If you know your expression string at the time of writing the program, convert it into a C function and let the compiler do the rest. — DYZ, Feb 27 '17 at 16:17
The user enters the function. It can be anything, but there's always one variable "t" — OD IUM, Feb 27 '17 at 16:18
@ODIUM So that mathematical formula might change, or is there a list of several available formulae? — AntonH, Feb 27 '17 at 16:20
Yes, the formula might change, this is why I need the parser in the first place — OD IUM, Feb 27 '17 at 16:20
Then, if the formula might change from one time to the next, I don't see any way to evaluate without calling some parser (whether from an external library, or one you make/made). Maybe an exception, you can keep the last to see if the formula changes from one time to the next, and call the parser only if the formula changes. But oitherwise, I don't see how you could get around it. — AntonH, Feb 27 '17 at 16:24
You could conceivably use the parser to create the C source for the function, compile it into a shared library and use dynamic linking to load it for the duration required...but you'd have to be *really* certain that there is nothing in the generated code that'd cause an security vulnerability. The suggested answers are a vastly safer/more sensible route though. — Chris Turner, Feb 27 '17 at 16:29
Yes, you pretty much need a parser. See my answer on how to write recursive descent parsers. These are easy, work great for expressions, and can easily be modified (the answer shows how) to build an AST and evaluate it. See http://stackoverflow.com/questions/2245962/is-there-an-alternative-for-flex-bison-that-is-usable-on-8-bit-embedded-systems/2336769#2336769 — Ira Baxter, Feb 27 '17 at 18:21
If your platform is x86, you could also use Tiny C Compiler as a library — Antti Haapala -- Слава Україні, Feb 27 '17 at 18:51

score 7 · Accepted Answer · answered Feb 27 '17 at 16:25

What you want to achieve is totally feasible. You can for instance build an AST (https://en.wikipedia.org/wiki/Abstract_syntax_tree) from the expression, where some nodes of the tree represent the variables.

Then, the evaluation of the expression for some values of the variables corresponds to the evaluation of that tree.

Actually, most parsers will produce such a tree internally and you may be able to find some library producing an AST for your need or you might want to code it yourself (see also https://en.wikipedia.org/wiki/Shunting-yard_algorithm). Here is a simple example of an expression parser producing an AST (in Java though).

Ajay Brahmakshatriya · Answer 2 · 2017-02-27T18:02:04.307

I totally agree with all the answers mentioned above about creating an AST and evaluating it at every iteration. That is by far the best (and the most correct) way of doing what you want.

But I write compilers for a living, and I can't help but suggest another interesting solution. I raised a brow when you said you want to create a "function" that accepts 1 argument and returns the result.

Let us try to do exactly that.

We will start by allocating some memory for our "function".

For now let us assume that 4k bytes is enough. So we start by doing

void* my_func = malloc(4k);

Now this is a real function we need to make the region executable. This will depend on your OS and you will have to invoke the right system call. All you have to do is give execute permission to this page.

Now we parse the string which represents the expression. I am assuming the WIN 64 fastcall calling convention here. You can use your own. So the parameter t will be in %rcx and thr result will be returned in %rax.

Now let's take an example expression - t*2 + 5

So we will have the assembly -

imulq $2, %rcx, %rcx
addq $5, %rcx
movq %rcx, %rax
retq

Now we assemble this into the equivalent bytes in the my_func

So you will have some equivalent of -

strcpy((char*)my_func, "\x48\x6B\xC9\x02\x48\x83\xC1\x05\x48\x89\C8\C3\0");

Just instead of one string you will have the buffer built over the parsing. But you get the idea.

If more memory is required, you can allocate double the size and copy the contents.

Finally all you have to do is call your "function" inside the loop -

typedef int (*func_type)(int);
for(t=0; t<N; t++)
    s=(func_type)(my_func)(t);

Although this is the most impractical and hard to implement method, I guarantee you this will give you the best performance (assuming you generated effecient assembly).

This is a fun exercise not to be taken seriously. It would be nice to see a library doing this for simple expressions.

Also don't forget to free your memory and remove the execute flag.

EDIT: One semi optimum but easy to generate strategy would be to use the stack for all the operations. Basically once you build the AST after parsing you do pop the arguments for each node from the stack, do the calculate the result using registers and push back on the stack. Final value can be popped off into %rax. This strategy would still be efficient that any run time evaluation of AST. Saves you from all the burden of register allocation and instruction scheduling.

very nice, thx a lot...this approach depends on the user's system,right? — OD IUM, Feb 28 '17 at 13:41
Yes, you would have to generate target specific code. I would rather suggest invoke a compiler and copy the bytes since that is what you will have to internally do. — Ajay Brahmakshatriya, Feb 28 '17 at 13:49

score 2 · Answer 3 · answered Feb 27 '17 at 16:27

Convert the (string) expression to an abstract syntax tree and interpret that in every iteration instead of (also) parsing it each time. If that's not enough then find a suitable byte code representation and compile your abstract syntax tree to a list of byte codes and feed that to a (byte code) interpreter. If that still isn't enough performance then compile either your abstract syntax tree or your byte code to machine code, use a (platform dependent) way to let the operating system know that it's executable code and call it.

As you might have guessed the difficulty and complexity greatly increases with every step. But it's a good project to learn. For production code you could consider using llvm for the byte code to machine code compilation task.

C parser for mathematical expressions with one variable and save it into a function

3 Answers3