0

Following a profiling of my C++ code, it appears that the pow function is used a lot.

Some of my pow functions have an integer exponent and another non-integer exponent. I am only interested for the ones with integer exponent.

To gain in performance, I am looking a way to define a macro like this:

#define pow(x,n) ({\
    double product;\
    if (typeid(n).name() == "i") {\
    for(int i = 0; i < n-1; i++)\
        product *= x;}\
    else\
    product = pow(x,n);\
    product;\
})

But I don't get the gain expected regarding the runtime. I think this is due to the else part in my macro where I call the classical pow function.

How can I determine in advance the type of exponent before macro was "written" during the pre-processing?

Ideally, I would like this macro only to be applied if the exponent is an integer, but it seems my attempt is not pertinent.

From your suggestions, I tried three options:

First option: Just add overload inline functions with base which is integer or double:

// First option
inline int pow(int x, int n){
    // using simple mechanism for repeated multiplication
    int product = 1;
    for(int i = 0; i < n; i++){
        product *= x;
    }
    return product;
}

inline int pow(double x, int n){
    // using simple mechanism for repeated multiplication
    double product = 1;
    for(int i = 0; i < n; i++){
        product *= x;
    }
    return product;
}

Result: runtime = 1 min 08 sec

Second option: Define a macro that calls via inline my_pow function if the exponent n is not an integer:

// Second option
inline double my_pow(double x, double n){
    return pow(x,n);
}

#define pow(x,n) ({\
    double product = 1;\
    if (typeid(n) == typeid(int)) {\
    for(int i = 0; i < n; i++)\
        product *= x;}\
    else product = my_pow(x,n);\
    product;\
})

Result: runtime = 51.86 sec

Third option: suggestion given in answer with template<typename type_t>

template<typename type_t>
inline double pow(const double x, type_t n)
{
    // This is compile time checking of types.
    // Don't use the RTTI thing you are trying to do
    //if constexpr (is_floating_point_v<type_t>)
    if (typeid(n) != typeid(int))
    {
        return pow(x, n);
    }
    else
    {
        double value = 1;
        for (type_t i = 0; i < n; ++i) value *= x;
        return value;
    }
}

Result: runtime = 52.84 sec

So finally, from these first tests, the best option would be the second one where I use a macro combined with a function that calls the general case of the pow function (both integer and floating exponent).

Is there a more efficient solution or is the second option the best?

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
  • 7
    Avoid defining unnecessary macros such as this. Use a function; it's superior in every way. – eerorika Oct 21 '21 at 11:30
  • 7
    overload would do the job here. – Jarod42 Oct 21 '21 at 11:31
  • What kind of gain were you expecting, and what did you get? – molbdnilo Oct 21 '21 at 12:01
  • `type_info::name` returns a `const char*` that will most likely not compare equal to `"i"` (and the name is non-portable and not guaranteed to be unique). A more robust comparison is `typeid(n) == typeid(int)`. (Not that it's useful for you case.) – molbdnilo Oct 21 '21 at 12:04
  • The `pow` algorithm implemented is not really the best. Using `pow(x, n)=pow(X*x,n/2)` for even n can reduce the number of multiplications from O(n) to O(log N). – MSalters Oct 21 '21 at 12:39
  • Using improper tool will create a problem instead of solution. Macro in C++ should be only used for things that cannot be done otherwise such as conditional compilation. This task can be easily done with template and or function overloading, macro is not necessary here and will not create neither more readable nor more effective code. – Slava Oct 21 '21 at 12:46
  • You should be using [type_traits](https://en.cppreference.com/w/cpp/header/type_traits). not macros. – PaulMcKenzie Oct 21 '21 at 12:56
  • 4
    Chill out and use `std::pow` which has overloads for integral arguments. – Bathsheba Oct 21 '21 at 13:08
  • Thanks for your suggestions, could you take a look please at my **UPDATE** where I tried 3 options and the runtimes obtained for each of them. –  Oct 21 '21 at 21:00
  • `y= pow(x,n)` can be implemented as `y= std::exp(n * std::log(n))`. Assuming each of exp/log needs 10 CPU cicles, I don't think your loop-multiplication can be faster for some `n`, say n>100. – Ripi2 Oct 23 '21 at 23:18
  • @youpilat13: did you even try to implement my solution? As I mentioned, the performance is O(n) instead of O(log n). I hope you're not spooked by the downvotes, given by some people? – Dominique Oct 24 '21 at 14:41
  • 1
    Are the integer exponents ever negative? – 1201ProgramAlarm Oct 24 '21 at 22:08
  • @1201ProgramAlarm. exponents can be integer negative and positive, and also double. –  Oct 24 '21 at 22:12
  • The function variant has the wrong return type: inline *int* pow(double x, int n), it may be that it is causing a slowdown due to conversion double->int->double. – Hans Olsson Oct 25 '21 at 07:19
  • 1
    Did you use `if (typeid(n) != typeid(n))` instead of `if constexpr (is_floating_point_v)` when you timed the third option? When do you expect `typeid(n) != typeid(n)` to be `true`? – Ted Lyngmo Oct 25 '21 at 20:23
  • Please post how you are getting these timing numbers. I ran a simple [benchmark](https://quick-bench.com/q/xmHRUiHlX1V3_vx4nRa2ER11_MI) to test this, and found very little difference between the macro approach and function template (as to be expected). Within a typical error-margin. Macros are very seldom (see: never) superior. – Human-Compiler Oct 30 '21 at 18:40

4 Answers4

14

If you only need to switch between floating point types or not you can use templates instead of macros.

#include <cassert>
#include <cmath>
#include <type_traits>

namespace my_math
{
    template<typename type_t>
    inline double pow(const double x, type_t n)
    {
        // this is compile time checking of types
        // don't use the rtti thing you are trying to do 
        if constexpr (std::is_floating_point_v<type_t>)
        {
            return std::pow(x, n);
        }
        else
        {
            double value = 1;
            for (type_t i = 0; i < n; ++i) value *= x;
            return value;
        }
    };

}

int main()
{
    assert(my_math::pow(2, 0) == 1);
    assert(my_math::pow(2.0, 1) == 2.0);
    assert(my_math::pow(3.0, 2.0) == 9.0);
    assert(my_math::pow(4.0f, 3.0f) == 64.0f);

   return 0;
}
Pepijn Kramer
  • 9,356
  • 2
  • 8
  • 19
1

Once you are sure your pow() is used, hereby a proposal to make your pow() function even better.

This idea might be difficult to implement, but in case you are regularly taking high powers, it might be worth-wile, let me show you an example:

You want to calculate pow(a,17).
Using your system, you need 16 multiplications.

Now let's turn 17 into binary, you get 10001, which means that, after some calculations, you can write pow(a,17) as:

square(square(square(square(a))))*a, where square(a) = a*a

This leaves you with just 5 multiplications and might cause a performance increase. In fact, it goes from O(n) to O(log n) (where n is the exponent).

Edit

Let me show you have you can accomplish this: imagine you need to calculate the 25th power of a number n. Then, first, you need to know the amount of times you need squares, using the simple formula:

a = round_down(log(25) / log(2)) = 4

So, you need all squares, from 0 to 4, and you create following array (sqr(n) stands for the square of n):

[1, n, sqr(n), sqr(sqr(n)), sqr(sqr(sqr(n))), sqr(sqr(sqr(sqr(n))))] with meanings:
 0, 1, 2,      4,           8,                16

You need the last part (the 16th power), you are left with 9, which is larger than 8.
You need the 8, you are left with 1, which is smaller than 4. You don't need the 4.
You don't need the 2.
You need the 1, you are left with 0, and here the loop stops.

So: n^25 = n * sqr(sqr(sqr(n))) * sqr(sqr(sqr(sqr(n))))

(I admit, the explanation is not 100%, but you catch the idea)

Dominique
  • 16,450
  • 15
  • 56
  • 112
  • @Yunnosch: let's await the reaction from the author. – Dominique Oct 21 '21 at 11:57
  • So the author is asking for a performance improver, and I transform an O(n) function into an O(log n), what's the problem here? – Dominique Oct 21 '21 at 12:15
  • @Yunnosch: I edited my answer accordingly. – Dominique Oct 21 '21 at 12:17
  • @Yunnosch: you really need to read the question well, especially the first three, four sentences: the author has been profiling (measuring the performance), and as such is interested in the increase of performance of the `pow` function, which is used quite often. In top of that, the author clearly states only being interested in `pow` function with integer exponent, which is exactly what my answer is covering. – Dominique Oct 21 '21 at 14:40
  • I did my best to read closely. It just seems that we are reading the same thing differently. That happens. I can somehow see your way of reading it, though I stay convinced of my way. I did not and will not downvote or flag. You actually have my respect for defending your point and staying with your answer. Have fun. – Yunnosch Oct 21 '21 at 14:47
  • @Yunnosch: Oh now I understand your point: the author wants to be sure that his own `pow()` is used, while my answer then means `Once you are sure your "pow()" is used, hereby a proposal to make your "pow()" function even better.`, so we don't contradict each other, we are completing each other. – Dominique Oct 21 '21 at 14:55
  • I can live with that. I recommend to edit that into your answer. It should make other voters less trigger-happy.... ;-) – Yunnosch Oct 21 '21 at 14:57
0

Instead of crafting a hand-tuned solution, try using compiler optimizations: -O3 -ffast-math or -O2 -ffast-math

Take a look at this answer.

What is more efficient? Using pow to square or just multiply it with itself?

and

If you are using double precision (double), std::pow(x, n) will be slower than the handcrafted equivalent unless you use -ffast-math, in which case, there is absolutely no overhead.

C++11 Performance tip: When to use std::pow?

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
lxkarthi
  • 336
  • 4
  • 14
  • Side note: Don't use macro unless you need pre-processor functionalities. Use functions or constexpr functions instead. – lxkarthi Oct 27 '21 at 18:04
-1

Use the c++ standard https://www.cplusplus.com/reference/ctgmath/ for macro pow version. or use constexpr statement for a pow function instead of a macro.

Anis Belaid
  • 304
  • 2
  • 8
  • 2
    This is a bad idea. "tgmath" is a C99 hack to introduce a workaround for overloading. C++ had overloading out of the box. – MSalters Oct 21 '21 at 12:35
  • This answer doesn't even address the problem in the question, which is trying to avoid the cost of calling `std::pow`. This suggestion just calls the C-version of `pow`, which may still be the exact same version in the implementation. How has this even received any upvotes at all? – Human-Compiler Oct 29 '21 at 22:13