I am struggling with Clang and GCC not using the vectorized version of sincos() in libmvec when vectorizing a loop that calls both sin() and cos() on the same argument. This is related to the question "Vectorization of sin and cos" from 6 years ago.
void func( float * p, int n, float a, float b )
{
    n = 8*4;
    for( int i = 0; i < n; i++ )
    {
        #pragma omp simd safelen(8) simdlen(8) aligned(p)
        for( int j = 0; j < n; j++ )
        {
            float angle = a*j+b*i;
            p[j+i*n] += a * sinf( angle ) + b * cosf( angle );
        }
    }
}
// gcc: -O3 -march=haswell -fopt-info-vec-missed -fopenmp-simd -ffast-math
// clang: -O3 -march=haswell -fopenmp-simd -ffast-math -fveclib=libmvec
// icc: -O3 -march=haswell
The assembly output for GCC, Clang, and ICC can be found on Godbolt:
GCC: https://gcc.godbolt.org/z/TnKz3YvqM GCC does not vectorize the loop at all and only calls the scalar sincosf in libm. This is a known bug that seems to have been forgotten: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70901.
Clang: https://gcc.godbolt.org/z/7db9xMTf6 Clang vectorizes sin() and cos() separately, calling _ZGVdN8v_sinf and _ZGVdN8v_cosf in libmvec.
ICC: https://gcc.godbolt.org/z/5PMnznahv ICC fully vectorizes sin and cos together with a call to __svml_sincosf8_l9.
Is there any workaround to make Clang (and GCC) use libmvec's _ZGVdN8vvv_sincosf?