Enable fast-math in Clang on a per-function basis?

Question

GCC provides a way to optimize a function/section of code selectively with fast-math using attributes. Is there a way to enable the same in Clang with pragmas/attributes? I understand Clang provides some pragmas to specify floating point flags. However, none of these pragmas enable fast-math.

PS: A similar question was asked before but was not answered in the context of Clang.

Denis Yaroshevskiy · Answer 1 · 2023-02-08T00:56:12.927

UPD: I missed that you mentioned pragmas. As far as I understand, Option 1 enables fast math for a single operation. I am not sure whether it affects flushing of subnormals; I hope it does not.

I didn't find any per-function options, but I did find two pragmas that can help.

Let's say we want a dot product.

Option 1.

#include <cstddef>  // std::size_t

float innerProductF32(const float* a, const float* b, std::size_t size) {
  float res = 0.f;

  for (std::size_t i = 0; i != size; ++i) {
    // Relax precise FP semantics for the loop body only.
#pragma float_control(precise, off)
    res += a[i] * b[i];
  }
  return res;
}

Option 2.

#include <cstddef>  // std::size_t

float innerProductF32(const float* a, const float* b, std::size_t size) {
  float res = 0.f;

  // Ask Clang to vectorize and interleave this loop.
  _Pragma("clang loop vectorize(enable) interleave(enable)")
  for (std::size_t i = 0; i != size; ++i) {
    res += a[i] * b[i];
  }
  return res;
}

The second one is less powerful: it does not generate FMA instructions, though perhaps you don't want those anyway.
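
To get closer to a per-function setting, the same float_control pragma from Option 1 can, as far as I can tell from the Clang docs, be placed at the very start of a function body or wrapped in push/pop at file scope. Below is a minimal sketch of both variants. The function names dotFastBody and dotFastPushPop are made up for illustration, and I have not checked how each Clang version treats subnormals under this pragma. Other compilers will most likely just ignore it.

```cpp
#include <cstddef>  // std::size_t

// Variant A: the pragma is the first thing in the function's compound
// statement, so it should apply to this function body only.
float dotFastBody(const float* a, const float* b, std::size_t size) {
#pragma float_control(precise, off)
  float res = 0.f;
  for (std::size_t i = 0; i != size; ++i) {
    res += a[i] * b[i];
  }
  return res;
}

// Variant B: save the current FP settings at file scope, relax them for
// one function, then restore them so later code is unaffected.
#pragma float_control(push)
#pragma float_control(precise, off)
float dotFastPushPop(const float* a, const float* b, std::size_t size) {
  float res = 0.f;
  for (std::size_t i = 0; i != size; ++i) {
    res += a[i] * b[i];
  }
  return res;
}
#pragma float_control(pop)
```

Variant A keeps the setting next to the code it affects. Variant B may be handier when several adjacent functions should share the relaxed semantics.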
