
GCC provides a way to selectively optimize a single function or section of code with fast-math using attributes. Is there a way to do the same in Clang with pragmas or attributes? I understand that Clang provides some pragmas for specifying floating-point flags. However, none of these pragmas seem to enable fast-math.

PS: A similar question was asked before but was not answered in the context of Clang.

1 Answer


UPD: I missed that you mentioned pragmas. As far as I understand, Option 1 applies fast-math semantics to a single operation. I am not sure whether it also affects flushing of subnormals; I hope it does not.

I didn't find any per-function options, but I did find two pragmas that can help.

Let's say we want dot product.

Option 1.

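#include <cstddef>

float innerProductF32(const float* a, const float* b, std::size_t size) {
  float res = 0.f;

  for (std::size_t i = 0; i != size; ++i) {
// Relax precise floating-point semantics for the loop body only.
// The pragma has to come first in the enclosing block.
#pragma float_control(precise, off)
    res += a[i] * b[i];
  }
  return res;
}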
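If the goal is closer to a per-function setting, the same pragma can also be placed at file scope and wrapped in push/pop so that it covers exactly one function. This is only a minimal sketch, assuming a Clang version that supports `float_control` with `push` and `pop`; the function name `innerProductRelaxed` is made up for illustration.

```cpp
#include <cstddef>

// Save the current floating-point settings, then relax precision for
// everything defined until the matching pop.
#pragma float_control(push)
#pragma float_control(precise, off)

// Hypothetical name. The whole function body is compiled with relaxed
// floating-point semantics.
float innerProductRelaxed(const float* a, const float* b, std::size_t size) {
  float res = 0.f;
  for (std::size_t i = 0; i != size; ++i) {
    res += a[i] * b[i];
  }
  return res;
}

// Restore the settings that were in effect before the push.
#pragma float_control(pop)
```

Code after the pop is compiled with the translation unit's normal settings again. Compilers that do not know the pragma will typically ignore it (possibly with a warning), so the function still computes the same mathematical result, just without the relaxed semantics.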

Option 2.

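#include <cstddef>

float innerProductF32(const float* a, const float* b, std::size_t size) {
  float res = 0.f;

  // Ask the loop vectorizer to vectorize and interleave this loop,
  // which lets it reorder the floating-point reduction.
  _Pragma("clang loop vectorize(enable) interleave(enable)")
  for (std::size_t i = 0; i != size; ++i) {
    res += a[i] * b[i];
  }
  return res;
}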

The second one is less powerful: it does not generate FMA instructions, but maybe FMA is not what you want anyway.
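For finer control over which relaxations are allowed, Clang also has `#pragma clang fp`, which switches individual behaviours on for one block. The sketch below is an assumption-laden illustration rather than a tested recipe: it assumes a Clang recent enough to accept the `reassociate` option (Clang 11 or later, I believe), and the name `innerProductFpPragma` is invented for the example.

```cpp
#include <cstddef>

// Hypothetical name. Enables only reassociation and FMA contraction,
// not the full set of fast-math relaxations.
float innerProductFpPragma(const float* a, const float* b, std::size_t size) {
  // These pragmas have to come first in the block they apply to.
  #pragma clang fp reassociate(on)
  #pragma clang fp contract(fast)
  float res = 0.f;
  for (std::size_t i = 0; i != size; ++i) {
    res += a[i] * b[i];
  }
  return res;
}
```

Unlike `float_control(precise, off)`, this leaves NaN and infinity handling untouched, which may matter if only vectorization of the reduction and FMA generation are wanted.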

Denis Yaroshevskiy