I have created a CPU dispatcher which compiles the same functions with different compile options into different object files. In order for my code to access the same functions in different object files I need to give the functions in each object file a different name.

In C (or C++) I would do something like this in the header file for the declarations of the function.

typedef float MyFuncType(float a);

MyFuncType  myfunc_SSE2, myfunc_SSE41, myfunc_AVX, myfunc_AVX2, myfunc_AVX512;

But now I want function templates for the declarations. My real code currently looks more like this

//kernel.h
template <typename TYPE, unsigned N, typename VALUES>
void foo_SSE2(int32_t *buffer, VALUES & v);

template <typename TYPE, unsigned N, typename VALUES>
void foo_SSE41(int32_t *buffer, VALUES & v);
...
template <typename TYPE, unsigned N, typename VALUES>
void foo_AVX512(int32_t *buffer, VALUES & v);

#if   INSTRSET == 2                    // SSE2
#define FUNCNAME foo_SSE2
#elif INSTRSET == 5                    // SSE4.1
#define FUNCNAME foo_SSE41
...
#elif INSTRSET == 9                    // AVX512
#define FUNCNAME foo_AVX512
#endif

These are only declarations in a header file. The function definitions are in a separate source file which is compiled to a different object file for each function name. The definitions look something like this

//kernel.cpp
#include "kernel.h"
template<typename TYPE, unsigned N, typename VALUES>
void FUNCNAME(int32_t *buffer, VALUES & v) {
    //code
}
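
One detail the snippet above leaves implicit: because FUNCNAME names a function template, an object file only contains code for the specializations that are actually instantiated in that translation unit. A minimal sketch of how kernel.cpp could force this with an explicit instantiation is shown here. The `Values` struct, the placeholder loop body, and the fallback `#define` are assumptions made for illustration and are not part of the original code.

```cpp
// kernel.cpp (sketch): compiled once per instruction set.
#include <cstdint>
using std::int32_t;

// Assumption: kernel.h normally selects FUNCNAME from INSTRSET. This fallback
// only keeps the sketch self-contained.
#ifndef FUNCNAME
#define FUNCNAME foo_SSE2
#endif

// Hypothetical argument type; the question does not show the real VALUES.
struct Values {
    float scale;
};

template <typename TYPE, unsigned N, typename VALUES>
void FUNCNAME(int32_t *buffer, VALUES &v) {
    // Placeholder body: write N scaled values into the buffer.
    for (unsigned i = 0; i < N; ++i) {
        buffer[i] = static_cast<int32_t>(static_cast<TYPE>(i + 1) * v.scale);
    }
}

// Explicit instantiation definition: emits foo_SSE2<float, 1, Values> (or
// whichever name FUNCNAME expands to) into this object file.
template void FUNCNAME<float, 1, Values>(int32_t *buffer, Values &v);
```

With this in place, main.cpp only needs the matching declaration, so it never instantiates the kernel itself with its own compile flags.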

Then I compile like this

g++ -c -O3 -msse2 kernel.cpp -o kernel_sse2.o
g++ -c -O3 -msse4.1 kernel.cpp -o kernel_sse41.o
...
g++ -c -O3 -mavx512f kernel.cpp -o kernel_avx512.o
g++ -O3 main.cpp kernel_sse2.o kernel_sse41.o ... kernel_avx512.o

The file main.cpp is another source file which only needs to know the function declarations so that the linker can link them to the definitions in the other object files. It looks like this

void dispatch(void) {
    int iset = instrset_detect();
    if (iset >= 9) {
        fp_float1  = &foo_AVX512<float,1>;  
    }
    else if (iset >= 8) {
        fp_float1  = &foo_AVX2<float,1>;
    }
    ...
    else if (iset >= 2) {
        fp_float1  = &foo_SSE2<float,1>;
    }
}
int main(void) {
    dispatch();
    fp_float1(buffer, values);
}
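
The dispatch code above omits the declaration of `fp_float1`. The sketch below fills that in under stated assumptions: `Values` is a hypothetical argument type, the three kernels are stand-in definitions that would normally come from the separately compiled object files, and `dispatch` takes the detected level as a parameter instead of calling `instrset_detect()` so that the example is self-contained. The final `else` falling back to SSE2 is also an assumption. It shows that `&foo_AVX512<float, 1>` is enough on the right-hand side, because the pointer type supplies the third template argument.

```cpp
#include <cstdint>
using std::int32_t;

// Hypothetical argument type; the question does not show the real one.
struct Values {
    int32_t tag;
};

// Stand-ins for the kernels that would normally be linked in from
// kernel_sse2.o, kernel_avx2.o and kernel_avx512.o.
template <typename TYPE, unsigned N, typename VALUES>
void foo_SSE2(int32_t *buffer, VALUES &v) { buffer[0] = 2; v.tag = 2; }

template <typename TYPE, unsigned N, typename VALUES>
void foo_AVX2(int32_t *buffer, VALUES &v) { buffer[0] = 8; v.tag = 8; }

template <typename TYPE, unsigned N, typename VALUES>
void foo_AVX512(int32_t *buffer, VALUES &v) { buffer[0] = 9; v.tag = 9; }

// The pointer type fixes VALUES, so the assignments below only spell out
// TYPE and N.
using KernelFn = void (*)(int32_t *, Values &);
KernelFn fp_float1 = nullptr;

// Real code would pass the result of instrset_detect() as iset.
void dispatch(int iset) {
    if (iset >= 9) {
        fp_float1 = &foo_AVX512<float, 1>;
    } else if (iset >= 8) {
        fp_float1 = &foo_AVX2<float, 1>;
    } else {
        fp_float1 = &foo_SSE2<float, 1>;  // assumed baseline
    }
}
```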

But in "kernel.h" it is annoying (and error-prone) to repeat the same declaration for every function name. I want something like the following (which I know does not work).

template <typename TYPE, unsigned N, typename VALUES>
typedef void foo(int32_t *buffer, VALUES & v);

foo foo_SSE2, foo_SSE41, foo_AVX, foo_AVX2, foo_AVX512;

Is there an ideal way to do this that separates the declarations from the definitions and lets me simply rename otherwise identical function template declarations?

Z boson
  • You can let the preprocessor do the job. – Columbo May 19 '15 at 08:56
  • You may save the argument typing with the template typedef (with `using`) [Live Demo](https://ideone.com/v7wDhj). But can't declare several template functions in a row :/ – Jarod42 May 19 '15 at 09:13
  • C++11 has `using` keyword for template aliasing – ftynse May 19 '15 at 09:13
  • @Jarod42, thank you for the code! My C++ is super rusty so I appreciate the help. I think using the preprocessor is the way to do this. I'll try that next. – Z boson May 19 '15 at 09:36
  • @ftynse, if you can post an answer using `using` please do. I would like to see how `using` is going to simplify my code. – Z boson May 19 '15 at 15:29
  • @Zboson what are you trying to accomplish? Would an extra template parameter (i.e. an enum specifying SSE2/SSE41/AVX512) be of any use? What are the extra template parameters? Are they really necessary? As it stands, your question is very unclear. Perhaps it would be useful to show how you would want to call these functions, or what kind of interface you desire. – rubenvb May 20 '15 at 11:53
  • @rubenvb, I updated my question and tried to make it more clear. It's probably too long now for anyone to read but I find it difficult to explain. – Z boson May 20 '15 at 12:16
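
The extra template parameter that rubenvb asks about could look roughly like the sketch below. It is an alternative to renaming, not the approach used in the question: the `InstrSet` enum, the `Values` struct, `select_kernel`, and the placeholder body are all hypothetical. The idea is that one declaration covers every instruction set, and each variant still gets its own symbol because the enum value is part of the specialization.

```cpp
#include <cstdint>
using std::int32_t;

// Hypothetical tag identifying the instruction set a kernel was built for.
enum class InstrSet { SSE2, SSE41, AVX, AVX2, AVX512 };

// Hypothetical argument type; the question does not show the real one.
struct Values {
    int32_t unused;
};

// kernel.h: a single declaration instead of one per function name.
template <InstrSet ISA, typename TYPE, unsigned N, typename VALUES>
void foo(int32_t *buffer, VALUES &v);

// kernel.cpp: a single definition. The body is a placeholder that records
// which variant ran.
template <InstrSet ISA, typename TYPE, unsigned N, typename VALUES>
void foo(int32_t *buffer, VALUES &v) {
    buffer[0] = static_cast<int32_t>(ISA);
    (void)v;
}

// In a real build each compilation of kernel.cpp would explicitly
// instantiate only its own InstrSet value. Two are shown together here so
// the sketch fits in one file.
template void foo<InstrSet::SSE2, float, 1, Values>(int32_t *, Values &);
template void foo<InstrSet::AVX512, float, 1, Values>(int32_t *, Values &);

using KernelFn = void (*)(int32_t *, Values &);

// Picks a variant at run time, as dispatch() does in the question.
KernelFn select_kernel(bool has_avx512) {
    if (has_avx512) {
        return &foo<InstrSet::AVX512, float, 1, Values>;
    }
    return &foo<InstrSet::SSE2, float, 1, Values>;
}
```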

1 Answer

This seems like an application for the preprocessor:

#define EMIT_FUNCTION_PROTOTYPE(func_name, func_suffix) \
    template<typename TYPE, unsigned N, typename VALUES> \
    void func_name ## func_suffix (int32_t *buffer, VALUES & v)

#define EMIT_FUNCTION_PROTOTYPES(func_name) \
    EMIT_FUNCTION_PROTOTYPE(func_name, _SSE2); \
    EMIT_FUNCTION_PROTOTYPE(func_name, _SSE41); \
    EMIT_FUNCTION_PROTOTYPE(func_name, _AVX); \
    EMIT_FUNCTION_PROTOTYPE(func_name, _AVX2); \
    EMIT_FUNCTION_PROTOTYPE(func_name, _AVX512)

Then it's just a one-liner to generate all of the prototypes in your header file:

EMIT_FUNCTION_PROTOTYPES(foo);
// expands to:
//     template <typename TYPE, unsigned N, typename VALUES>
//     void foo_SSE2(int32_t *buffer, VALUES & v);
//
//     template <typename TYPE, unsigned N, typename VALUES>
//     void foo_SSE41(int32_t *buffer, VALUES & v);
//
//     template <typename TYPE, unsigned N, typename VALUES>
//     void foo_AVX(int32_t *buffer, VALUES & v);
//
//     template <typename TYPE, unsigned N, typename VALUES>
//     void foo_AVX2(int32_t *buffer, VALUES & v);
//
//     template <typename TYPE, unsigned N, typename VALUES>
//     void foo_AVX512(int32_t *buffer, VALUES & v);

I don't think this is a huge benefit, but it should give you what you want.
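
If more than one thing has to be generated per instruction set, the suffix list can be factored out with the X-macro idiom so that it is written only once. The following is a sketch under assumptions: `FOR_EACH_INSTRSET`, `EMIT_FUNCTION_NAME`, and `kKernelNames` are hypothetical names introduced here, and the prototype macro carries its own trailing semicolon, unlike the version above.

```cpp
#include <cstdint>
using std::int32_t;

// The list of suffixes, written once. X is the macro applied to each entry.
#define FOR_EACH_INSTRSET(X, func_name) \
    X(func_name, _SSE2)                 \
    X(func_name, _SSE41)                \
    X(func_name, _AVX)                  \
    X(func_name, _AVX2)                 \
    X(func_name, _AVX512)

// Emits one function template declaration, e.g. foo_SSE2.
#define EMIT_FUNCTION_PROTOTYPE(func_name, func_suffix)   \
    template <typename TYPE, unsigned N, typename VALUES> \
    void func_name##func_suffix(int32_t *buffer, VALUES &v);

// Emits the function name as a string literal followed by a comma.
#define EMIT_FUNCTION_NAME(func_name, func_suffix) #func_name #func_suffix,

// Declares foo_SSE2, foo_SSE41, foo_AVX, foo_AVX2 and foo_AVX512.
FOR_EACH_INSTRSET(EMIT_FUNCTION_PROTOTYPE, foo)

// Reuses the same list, here for a table of names (useful for logging).
const char *const kKernelNames[] = {FOR_EACH_INSTRSET(EMIT_FUNCTION_NAME, foo)};
```

Adding a new instruction set then means adding one line to `FOR_EACH_INSTRSET`, and every use of the list picks it up.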

Jason R