Call them in a loop to test throughput or latency, depending on whether the input to the next call depends on the output of the previous or not. Probably with data from a small to medium sized array of random values for the throughput test.
You want your compiler to produce an asm loop that does minimal work beyond the function call, so use appropriate techniques for whatever language and compiler you choose. (Idiomatic way of performance evaluation?)
You might disassemble or single-step through their execution to look for data-dependent branching to figure out which ranges of inputs might be faster or slower. (Or for open-source math libraries like glibc, the commented source could be good to look at.)