
What I really want to do is compare the performance of different algorithms that solve the same task in different ways. Such algorithms (apply_10_times in my example) have sub-algorithms, which should be switchable and which also receive template arguments. In my example they are called apply_x and apply_y and take int SOMETHING as a template argument.

I think the solution would be to specify a template function as template parameter to another template function. Something like this, where template_function is of course pseudo-code:

template<int SOMETHING>
inline void apply_x(int &a, int &b) {
    // ...
}

template<int SOMETHING>
inline void apply_y(int &a, int &b) {
    // ...
}

template<template_function APPLY_FUNCTION, int SOMETHING>
void apply_10_times(int &a, int &b) {
    for (int i = 0; i < 10; i++) {
        cout << SOMETHING; // SOMETHING gets used here directly as well

        APPLY_FUNCTION<SOMETHING>(a, b);
    }
}

int main() {
    int a = 4;
    int b = 7;
    apply_10_times<apply_x, 17>(a, b);
    apply_10_times<apply_y, 19>(a, b);
    apply_10_times<apply_x, 3>(a, b);
    apply_10_times<apply_y, 2>(a, b);

    return 0;
}

I've read that it's not possible to pass a template function as a template parameter, so I can't pass APPLY_FUNCTION this way. The solution, afaik, is to use a wrapping struct, which is then called a functor, and pass the functor as a template argument. Here is what I got with this approach:

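struct apply_x_functor {
    // apply must itself be a member template for the call
    // APPLY_FUNCTOR:: template apply<SOMETHING>(a, b) below to compile
    template<int SOMETHING>
    static inline void apply(int &a, int &b) {
        // ...
    }
};

struct apply_y_functor {
    template<int SOMETHING>
    static inline void apply(int &a, int &b) {
        // ...
    }
};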

template<typename APPLY_FUNCTOR, int SOMETHING>
void apply_10_times(int &a, int &b) {
    for (int i = 0; i < 10; i++) {
        cout << SOMETHING; // SOMETHING gets used here directly as well
        APPLY_FUNCTOR:: template apply<SOMETHING>(a, b);
    }
}

This approach apparently works. However, the line APPLY_FUNCTOR:: template apply<SOMETHING>(a, b); looks rather ugly to me. I'd prefer to use something like APPLY_FUNCTOR<SOMETHING>(a, b); and in fact this seems possible by overloading the operator(), but I couldn't get this to work. Is it possible and if so, how?

Daniel S.
  • @YvesDaoust yes, but then the sub algorithms apply_x and apply_y cannot be inlined. Or can they? – Daniel S. Aug 16 '22 at 08:30
  • Couldn't you pass the function to `apply_10_times()` and make that function's type a template type? – oraqlle Aug 16 '22 at 09:24
  • Consider passing a lambda instead, as a function parameter: `template <typename Func> void apply_10_times(int a, int b, Func func);` called as `apply_10_times(a, b, [](int a, int b){ ... });`. With all the inlining, the performance may not be very different. – Malcolm Aug 16 '22 at 09:34
  • @YvesDaoust your idea, your job to check. The functor approach can be inlined, because it's known at compile time, which exact function is called. With the pointers it's not necessarily known. – Daniel S. Aug 16 '22 at 09:37
  • why do you need `APPLY_FUNCTION` and `SOMETHING` as separate template arguments? Why do you need them as template arguments at all? `apply_10_times` could just take any callable as parameter and call it. – 463035818_is_not_an_ai Aug 16 '22 at 09:44
  • @463035818_is_not_a_number i want inlining for performance reasons. apply_x and apply_y are rather small functions which get called a lot of times, so inlining them is crucial. – Daniel S. Aug 16 '22 at 09:46
  • @463035818_is_not_a_number maybe APPLY_FUNCTION and SOMETHING don't need to be separate. I didn't find the right syntax to combine them. I'll check out your answer. Maybe this is what I need. – Daniel S. Aug 16 '22 at 09:56
  • @463035818_is_not_a_number Ah, now I understand what you mean. Yes, `apply_10_times()` also uses `SOMETHING` directly. Sorry, I simplified the code in the question too much. – Daniel S. Aug 16 '22 at 10:03
  • @463035818_is_not_a_number I can, however, also pass it another time separately. It's not beautiful, because the caller has to make sure both match, – Daniel S. Aug 16 '22 at 10:05
  • @DanielS.: the pointers can be known at compile-time exactly the same way. –  Aug 16 '22 at 10:09
  • @DanielS. see edit, you do not need to specify them separately. You can deduce `SOMETHING` from a `apply_x` – 463035818_is_not_an_ai Aug 16 '22 at 10:10
  • @YvesDaoust well, I want my problem to be solved, agreed, but the way you formulated the first time sounds like this was just a rough idea and you don't really know what you are suggesting. So I wouldn't follow it, because this doesn't sound promising. I'll try it anyways. – Daniel S. Aug 16 '22 at 10:25

2 Answers


Since it is not clear why you need APPLY_FUNCTION and SOMETHING as separate template arguments, or why they need to be template arguments at all, I'll state the obvious solution. It may not apply to your real case, but it does apply to the code in the question.

#include <iostream>

template<int SOMETHING>
inline void apply_x(int a, int b) {
    std::cout << a << " " << b;
}

template<int SOMETHING>
inline void apply_y(int a, int b) {
    std::cout << a << " " << b;
}

template<typename F>
void apply_10_times(int a, int b, F f) {
    for (int i = 0; i < 10; i++) {
        f(a, b);
    }
}

int main() {
    int a = 4;
    int b = 7;
    apply_10_times(a, b, apply_x<17>);
    apply_10_times(a, b, apply_y<24>);
}

If you want to keep the function to be called as a template argument, you can use a function pointer as a non-type template argument:

template<void(*F)(int,int)>
void apply_10_times(int a, int b) {
    for (int i = 0; i < 10; i++) {
        F(a, b);
    }
}

int main() {
    int a = 4;
    int b = 7;
    apply_10_times<apply_x<17>>(a, b);
    apply_10_times<apply_y<24>>(a, b);
}
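For reference, here is a sketch of the same function-pointer idea using the reference signature from the question, so the sub-algorithms can modify a and b. The bodies of apply_x and apply_y and the helper demo_sum are made up for illustration only; the question leaves the real bodies out.

```cpp
// Hypothetical bodies: the question only shows "// ..." here.
template <int SOMETHING>
inline void apply_x(int &a, int &b) {
    a += SOMETHING;
    b -= 1;
}

template <int SOMETHING>
inline void apply_y(int &a, int &b) {
    a -= 1;
    b += SOMETHING;
}

// F is a compile-time constant, so every instantiation names exactly one
// callee and the compiler is free to inline it.
template <void (*F)(int &, int &)>
void apply_10_times(int &a, int &b) {
    for (int i = 0; i < 10; i++) {
        F(a, b);
    }
}

// Hypothetical helper that only exists to exercise the sketch.
inline int demo_sum() {
    int a = 4;
    int b = 7;
    apply_10_times<apply_x<17>>(a, b);  // a = 174, b = -3
    apply_10_times<apply_y<24>>(a, b);  // a = 164, b = 237
    return a + b;
}
```

Whether the call is actually inlined is still up to the optimizer, so it is worth checking the generated code for the real functions.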

In any case I see no reason to have APPLY_FUNCTION and SOMETHING as separate template arguments. The only gain is more complex syntax which is exactly what you want to avoid. If you do need to infer SOMETHING from an instantiation of either apply_x or apply_y, this is also doable without passing the template and its argument separately, though again you'd need to use class templates rather than function templates.


PS:

Ah, now I understand what you mean. Yes, apply_10_times() also uses SOMETHING directly. Sorry, I simplified the code in the question too much.

As mentioned above, this still does not imply that you need to pass them separately. You can deduce SOMETHING from an apply_x<SOMETHING> via partial template specialization. This, however, requires class templates rather than function templates:

#include <iostream>

template <int SOMETHING>
struct foo {};

template <int X>
struct bar {};

template <typename T>
struct SOMETHING;

template <template <int> class T,int V>
struct SOMETHING<T<V>> { static constexpr int value = V; };

int main() {
    std::cout << SOMETHING< foo<42>>::value;
    std::cout << SOMETHING< bar<42>>::value;
}
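Putting the trait together with the functor approach from the question could look roughly like the sketch below. The trait is renamed to something_of here (my naming, to avoid using SOMETHING for both the struct and the int), the functor bodies are invented, and demo_sum is a hypothetical helper. The functors overload operator(), so the call site needs no `:: template` syntax.

```cpp
#include <iostream>

// Same trait as above, under a different (hypothetical) name.
template <typename T>
struct something_of;

template <template <int> class T, int V>
struct something_of<T<V>> {
    static constexpr int value = V;
};

// Hypothetical functor bodies: the question only shows "// ..." here.
template <int SOMETHING>
struct apply_x_functor {
    void operator()(int &a, int &b) const {
        a += SOMETHING;
        b -= 1;
    }
};

template <int SOMETHING>
struct apply_y_functor {
    void operator()(int &a, int &b) const {
        a -= 1;
        b += SOMETHING;
    }
};

static_assert(something_of<apply_x_functor<17>>::value == 17,
              "the trait recovers the template argument");

// Only the functor type is passed; SOMETHING is recovered from it.
template <typename APPLY_FUNCTOR>
void apply_10_times(int &a, int &b) {
    constexpr int SOMETHING = something_of<APPLY_FUNCTOR>::value;
    for (int i = 0; i < 10; i++) {
        std::cout << SOMETHING;  // SOMETHING is still usable directly
        APPLY_FUNCTOR{}(a, b);   // stateless functor, called via operator()
    }
}

// Hypothetical helper that only exists to exercise the sketch.
inline int demo_sum() {
    int a = 4;
    int b = 7;
    apply_10_times<apply_x_functor<17>>(a, b);  // a = 174, b = -3
    apply_10_times<apply_y_functor<19>>(a, b);  // a = 164, b = 187
    return a + b;
}
```

The caller writes apply_10_times<apply_x_functor<17>>(a, b), so the functor and its argument cannot get out of sync.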
463035818_is_not_an_ai
  • I think we're getting close to what I'm actually after in the PPS, but I'm getting confused now. Is `SOMETHING` in the PPS used twice, once as an `int` literal and once as a `struct`? – Daniel S. Aug 16 '22 at 10:15
  • Do you use the words "class" and "struct" synonymously in the PPS? ("This however requires to use class templates not function templates") – Daniel S. Aug 16 '22 at 10:18
  • My problem with the PPS is that `foo` needs to be replaceable in `struct SOMETHING>`. There is `apply_x` and `apply_y` as candidates for `foo`. – Daniel S. Aug 16 '22 at 10:20
  • @DanielS. oh, i missed that. It is still possible. Just a moment... – 463035818_is_not_an_ai Aug 16 '22 at 10:32
  • @DanielS. both `struct` and `class` declare a class type. The only difference is that with `struct` the default access modifier is public whilst with `class` it is private. There is no way in the (template) type system to differentiate between a class declared with the `struct` keyword and a class declared with the `class` keyword because once they've been declared there is no difference (encoded) in the type system. – bolov Aug 16 '22 at 10:33
  • @463035818_is_not_a_number you're in ;) -- apply_x and apply_y have just one int template argument. – Daniel S. Aug 16 '22 at 10:36
  • @DanielS. but note that I am using this only on the trait. There is no reason to have `APPLY_FUNCTION` and `SOMETHING` separately anywhere else when anyhow it always refers to some `APPLY_FUNCTION` – 463035818_is_not_an_ai Aug 16 '22 at 10:39
  • Can you explain this line `template – Daniel S. Aug 16 '22 at 10:43
  • @DanielS. https://stackoverflow.com/questions/213761/what-are-some-uses-of-template-template-parameters – 463035818_is_not_an_ai Aug 16 '22 at 10:45
  • It's still irritating to me that once it's `struct SOMETHING` and once `int SOMETHING`. Do they need to be called the same name? – Daniel S. Aug 16 '22 at 10:54
  • no. It can be `template struct foo {};`. – 463035818_is_not_an_ai Aug 16 '22 at 10:54
  • I've applied the concept of the PPS part and I'm quite happy with it. Thanks for your patience! – Daniel S. Aug 16 '22 at 11:30

What I really want to do is to compare the performance of different algorithms which solve the same task in different ways.

You should provide more details about that.

Your first step should be to get familiar with Google Benchmark. There is a site which provides it online. This tool gives proper patterns for your scenario.

As a next step, you must be aware that C and C++ have the "as-if rule", which allows the optimizer to do wonderful things but makes creating a good performance test extremely difficult. It is easy to write a test which doesn't measure the actual production code.
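To illustrate the trap, here is a minimal hand-rolled timing sketch using only the standard library (not Google Benchmark). The workload sum_of_squares, the sink variable and time_sum_of_squares are all hypothetical names chosen for this example. If the result were never used, the optimizer would be allowed to drop the whole computation, and the timer would measure nothing.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical workload standing in for one of the algorithms under test.
inline std::uint64_t sum_of_squares(std::uint64_t n) {
    std::uint64_t total = 0;
    for (std::uint64_t i = 0; i < n; ++i) {
        total += i * i;
    }
    return total;
}

// A write to a volatile object is observable behaviour, so the compiler
// cannot discard the store of the result.
volatile std::uint64_t sink = 0;

inline std::chrono::nanoseconds time_sum_of_squares(std::uint64_t n) {
    const auto start = std::chrono::steady_clock::now();
    sink = sum_of_squares(n);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start);
}
```

Even with the volatile sink, the compiler may still simplify the loop itself (for example when n is a compile-time constant), so a single measurement like this proves little. That is part of why a dedicated benchmarking tool is the safer choice.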

Here is a CppCon talk showing how many traps are hidden when writing a good performance test for C++ code. So be very, very careful.

Marek R
  • Thank you for your worthy suggestions, which I will definitely also follow. However, it seems that this is quite an overkill for the task I'm trying to get done right now and it's not really an answer for the question, as it's more generally targeted towards performance testing. My problem is that I have composite algorithms which can be combined in different ways. In other words, I want to use some kind of inversion of control and make sure inlining can still be done. – Daniel S. Aug 16 '22 at 10:31