I'm using OpenMP to parallelize a loop that internally uses AVX-512 through Agner Fog's Vector Class Library (VCL).
Here is the code:
double HarmonicSeries(const unsigned long long int N) {
  unsigned long long int i;
  Vec8d divV(1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0);
  Vec8d sumV(0.0);
  const Vec8d addV(8.0);
  const Vec8d oneV(1.0);
  #pragma omp parallel for reduction(+:sumV,divV)
  for(i=0; i<N; ++i) {
    sumV += oneV / divV;
    divV += addV;
  }
  return horizontal_add(sumV);
}
When I try to compile the code above, I get the following errors:
g++ -Wall -Wextra -O3 -g -I include -fopenmp -m64 -mavx2 -mfma -std=c++17 -o harmonic_series harmonic_series.cpp
harmonic_series.cpp:87:40: error: user defined reduction not found for ‘sumV’
87 | #pragma omp parallel for reduction(+:sumV,divV)
| ^~~~
harmonic_series.cpp:87:45: error: user defined reduction not found for ‘divV’
87 | #pragma omp parallel for reduction(+:sumV,divV)
Any hints on how to solve this and provide a user-defined reduction for the Vec8d class? The combiner is simply the plus operator, which VCL already defines, but I cannot find an example of how to declare it.
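For reference, the construct the error message points at is OpenMP's `declare reduction` directive. Below is a minimal sketch of what such a declaration might look like, not a verified VCL solution. To keep it compilable without the library, it uses a hypothetical stand-in struct `Vec8` (eight doubles with an element-wise `operator+=`) instead of `Vec8d`, and the function name `HarmonicSeriesSketch` is likewise made up for illustration. It also assumes that only `sumV` should be reduced: the divisor is recomputed from the loop index, because a running `divV` carried across iterations is not something a sum reduction can split between threads.

```cpp
#include <cassert>

// Hypothetical stand-in for VCL's Vec8d (an assumption, so the sketch builds
// without the library): eight doubles with an element-wise operator+=.
struct Vec8 {
    double v[8] = {0, 0, 0, 0, 0, 0, 0, 0};
    Vec8& operator+=(const Vec8& o) {
        for (int k = 0; k < 8; ++k) v[k] += o.v[k];
        return *this;
    }
};

// Tell OpenMP how to merge two partial results of type Vec8 and how to
// initialize each thread's private copy. omp_out, omp_in and omp_priv are
// the identifiers reserved by the OpenMP specification for this purpose.
#pragma omp declare reduction(+ : Vec8 : omp_out += omp_in) \
    initializer(omp_priv = Vec8())

double HarmonicSeriesSketch(const unsigned long long N) {
    // Keep 8*N below 2^53 so every divisor is exactly representable as a double.
    assert(N < (1ULL << 50));

    Vec8 sumV;  // all lanes start at zero

    #pragma omp parallel for reduction(+ : sumV)
    for (unsigned long long i = 0; i < N; ++i) {
        // Iteration i covers the divisors 8*i+1 ... 8*i+8, derived from i
        // so that iterations are independent of each other.
        const double base = 8.0 * static_cast<double>(i);
        Vec8 term;
        for (int k = 0; k < 8; ++k) term.v[k] = 1.0 / (base + k + 1.0);
        sumV += term;
    }

    // Horizontal add of the eight lanes.
    double total = 0.0;
    for (int k = 0; k < 8; ++k) total += sumV.v[k];
    return total;
}
```

With the real library, the same pattern would presumably become `declare reduction(+ : Vec8d : omp_out += omp_in) initializer(omp_priv = Vec8d(0.0))`, and the per-iteration divisor could be built as `Vec8d(1.0, 2.0, ..., 8.0) + 8.0 * i`; both of these are untested assumptions. Calling `HarmonicSeriesSketch(1)` sums 1/1 through 1/8.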
Thanks a lot for any help!