I am profiling some C++ code with perf, and I see that __scalbnf and __wrap_scalbnf are taking up a good chunk of the run time. I looked up what these functions are, and my best guess is I am calling them via a call to std::exp. However I'd like to be able to confirm this. Is there a place where I can see the C++ code implementing std::exp to confirm this? Or what is the best way for me (a C++ amateur) to start digging into this and understanding what is happening?

Thank you.

TFdoe
  • Well, if you are using libstdc++ or libc++ you can examine their code. You might also be able to step into the code with a debugger. – NathanOliver Jul 02 '18 at 15:40
  • Add a breakpoint on them, then run in a debugger until it breaks and look at the call stack to find the last function you recognize. – Tyker Jul 02 '18 at 15:40
  • clang's libcxx source is here: https://github.com/llvm-mirror/libcxx. However, it looks like std::exp uses the C functions exp, expf, and expl, so reading the C++ code won't help... – Olivier Sohn Jul 02 '18 at 16:01

2 Answers

Set a breakpoint on __scalbnf (or __scalbn for the double version). Run your program, and when it stops, look at a backtrace (bt in GDB). If your guess is right, the call stack will show exp() as a caller of __scalbnf.

If a function has multiple callers, the first hit might not be from the "hot" function you're profiling.

To actually figure out which higher-up function (including its children) is responsible for using a lot of time, see linux perf: how to interpret and find hotspots. Top-down profiling can find expensive functions that do all their work in calls to other functions, even when those other functions also have "innocent" callers. (e.g. memcpy is heavily used and often unavoidable, but what you'd want to find are callers that use it too much and could be optimized to call it less, or not at all.)


And BTW, yes, glibc's math library exp() implementation does use __scalbn internally. I'm not sure how bad the implementation is, but I don't see an asm version for x86-64, only this pure C version: https://code.woboq.org/userspace/glibc/sysdeps/ieee754/dbl-64/wordsize-64/s_scalbn.c.html. (For __scalbnl (long double) there's https://code.woboq.org/userspace/glibc/sysdeps/x86_64/fpu/s_scalbnl.S.html, using the x87 fscale instruction for 80-bit floats. For the other sizes there are asm files only for i386 and IA-64 (Itanium), not for x86-64.)
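To give some intuition for why an exp() implementation would call scalbn at all, here is a minimal sketch of the usual range-reduction idea: write x = k*ln(2) + r, approximate exp(r) on the small interval, then scale by 2^k with scalbn. This is not glibc's code; the function name exp_sketch, the degree-5 Taylor polynomial, and the single-constant reduction are my own simplifications for illustration, and it makes no attempt to handle overflow, NaN, or very large inputs.

```cpp
#include <cmath>

// Illustrative only, NOT glibc's implementation.
// Uses exp(x) = 2^k * exp(r), where x = k*ln2 + r and |r| <= ln2/2.
float exp_sketch(float x) {
    const float ln2 = 0.69314718056f;

    // Pick the integer k closest to x / ln2.
    int k = static_cast<int>(std::lround(x / ln2));

    // Reduced argument; a real libm does this step much more carefully.
    float r = x - static_cast<float>(k) * ln2;

    // Degree-5 Taylor polynomial for exp(r), evaluated in Horner form.
    float p = 1.0f + r * (1.0f + r * (0.5f + r * (1.0f / 6 + r * (1.0f / 24 + r * (1.0f / 120)))));

    // Multiply by 2^k. This is the step where scalbn shows up.
    return std::scalbn(p, k);
}
```

For moderate inputs this agrees with std::exp to a few digits, which is enough to see the structure; the point is only that the final multiply by a power of two is a natural place for a scalbn call.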

glibc does have some vectorized EXP code, though, like the SSE4 SVML version https://code.woboq.org/userspace/glibc/sysdeps/x86_64/fpu/multiarch/svml_d_exp2_core_sse4.S.html#_ZGVbN2v_exp_sse4.


If you want higher-performance exp() without perfect accuracy, see Fastest Implementation of Exponential Function Using AVX (that's for float, not double. I forget if there's an SO answer with a double version).

Also related: Efficient implementation of log2(__m256d) in AVX2.

Peter Cordes

To confirm that std::exp is the reason for __scalbnf and __wrap_scalbnf, you can replace the std::exp calls by either:

  • an identity function that returns the input value
  • or by an alternative exp implementation (for example fm_exp, found here)

Then, if you still see __scalbnf and __wrap_scalbnf in the profiler output, it means it's not coming from std::exp.
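One way to make that swap easy is to route every call through a single wrapper and switch its body at compile time. This is only a sketch: the wrapper name exp_under_test and the macro EXP_IDENTITY are hypothetical names I made up, not something from your code or from any library.

```cpp
#include <cmath>

// Hypothetical build switch (an assumption for this sketch):
//   compile with -DEXP_IDENTITY to make the wrapper return its input,
//   compile without it to forward to std::exp as usual.
inline float exp_under_test(float x) {
#if defined(EXP_IDENTITY)
    return x;            // identity: no libm call at all
#else
    return std::exp(x);  // normal behavior
#endif
}
```

Replace the std::exp calls in the hot code with exp_under_test, build both ways, and compare the perf output. Keep in mind that the identity version changes the program's results, so later code may take different paths; treat it as a profiling experiment only.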

Olivier Sohn