In a CPU-bound C++ simulation I'm writing, I've used valgrind to trace a bottleneck to `std::exp` (from `<cmath>`). It currently eats over 40% of my simulation time. I can bound the input to a relatively small domain, but I'd like to control the accuracy. I was considering replacing `exp` with a LUT (lookup table), but I'm not quite sure how to do this the "right way"(tm). Concerns I have:
1. Large lookup tables will not fit in the cache, which slows access.
2. What is the best way to convert a double input to an integer index into the lookup table?
3. Does the answer to (2) depend on the slope of the input function?
4. Am I reinventing the wheel? Has this already been done before?
What is the best way to implement a LUT for `exp`, or to include one from a library?
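
To make the question concrete, here is a minimal sketch of the kind of table I have in mind. `ExpLut` and its parameters are my own hypothetical names, not from any library. It assumes a uniform grid on a bounded domain `[lo, hi]`, linear interpolation between entries, and clamping of out-of-range inputs. As far as I understand, the linear interpolation error on a grid with spacing `h` is bounded by roughly `h*h/8 * exp(hi)`, so the table size would be the knob for accuracy.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a library class): tabulates exp on [lo, hi]
// at n evenly spaced points and linearly interpolates between them.
class ExpLut {
public:
    ExpLut(double lo, double hi, std::size_t n)
        : lo_(lo), hi_(hi), inv_step_((n - 1) / (hi - lo)), table_(n) {
        assert(n >= 2 && hi > lo);
        const double step = (hi - lo) / (n - 1);
        for (std::size_t i = 0; i < n; ++i) {
            table_[i] = std::exp(lo + i * step);
        }
    }

    double operator()(double x) const {
        // Clamp to the tabulated domain (an assumption; the caller could
        // instead fall back to std::exp for out-of-range inputs).
        if (x < lo_) x = lo_;
        if (x > hi_) x = hi_;

        // Map x to a fractional table position, then split it into an
        // integer index and the remainder used for interpolation.
        const double t = (x - lo_) * inv_step_;
        std::size_t i = static_cast<std::size_t>(t);
        if (i >= table_.size() - 1) i = table_.size() - 2;
        const double frac = t - static_cast<double>(i);

        return table_[i] + frac * (table_[i + 1] - table_[i]);
    }

private:
    double lo_;
    double hi_;
    double inv_step_;            // number of table intervals per unit of x
    std::vector<double> table_;  // n doubles, i.e. 8 * n bytes
};
```

For example, `ExpLut lut(0.0, 1.0, 1025);` would need about 8 KB, which I would expect to fit in L1 cache on typical hardware, and `lut(x)` would then stand in for `std::exp(x)` on that domain. Whether a uniform grid is the right choice is exactly what I'm unsure about in (2) and (3).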