I recently tried to add the -ftree-vectorize compiler option to the build step of my project. I ran my tests after the change, and for some reason one of my end-to-end tests now fails when compiled with Clang because of significant floating-point differences. This is quite surprising to me, especially since the GCC build still works fine.
Digging into the issue a bit, I found in the GCC documentation that -ftree-vectorize turns on two separate flags, -ftree-loop-vectorize and -ftree-slp-vectorize. Clang, in its effort to match GCC, also has the -ftree-vectorize flag and the -ftree-slp-vectorize flag. I tested Clang with just the -ftree-slp-vectorize option, and the test passes.
However, Clang doesn't have the -ftree-loop-vectorize option, so I can't try that flag on its own, and judging from the Clang documentation I can't tell whether -ftree-vectorize turns on two flags under the hood the way it does in GCC.
I am rather stumped about how vectorization can affect floating-point results. I know that floating-point operations aren't associative, so I wouldn't expect Clang to break the as-if rule by reordering them when it vectorizes. I'm pretty sure that I didn't switch on unsafe math optimizations either, since my compiler options are just -g -O2 -ftree-vectorize.
In the worst case I've hit some implementation-defined behavior or UB, but I wanted to ask people with more experience first whether there is something else I'm missing.