14

I am compiling my code using the following command:

gcc -O3 -ftree-vectorizer-verbose=6 -msse4.1 -ffast-math 

With these flags, all the -O3 optimizations are enabled, including auto-vectorization.

But I want to disable vectorization while keeping the other optimizations.

Peter Cordes
PhantomM

3 Answers

18

Most GCC -f switches can be disabled by adding a no- prefix. Try -fno-tree-vectorize (placed after -O3 on the command line).
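As a rough illustration of the flag in use, here is a minimal sketch: a small loop that GCC would normally vectorize at -O3, with two example command lines in the leading comment. The file name vec_demo.c and the function add_arrays are made up for this example. The -fopt-info-vec reporting flag assumes a reasonably recent GCC (4.8 or later, as far as I know); on older releases the -ftree-vectorizer-verbose flag from the question plays that role.

```c
/* vec_demo.c (hypothetical file name)
 *
 * Vectorizer on (the -O3 default), with a report of vectorized loops:
 *     gcc -O3 -msse4.1 -ffast-math -fopt-info-vec -c vec_demo.c
 *
 * Same optimization level, but with auto-vectorization switched off.
 * The -fno-tree-vectorize flag comes after -O3 so that it wins:
 *     gcc -O3 -fno-tree-vectorize -msse4.1 -ffast-math -fopt-info-vec -c vec_demo.c
 */
#include <stddef.h>

/* Element-wise a[i] += b[i]: a simple candidate for auto-vectorization. */
void add_arrays(float *restrict a, const float *restrict b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] += b[i];
}
```

With the second command line the vectorization report should no longer list this loop, while the other -O3 passes still run.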

Mat
  • 2
    I tried this and still found xmm0 register in my code and calls to __ieee754_exp_avx. @Mat ? any help is welcome. – Hugo Nov 26 '19 at 22:56
  • 2
    @Hugo, you have to differentiate between auto-vectorization and the usage of SIMD instructions. You can try `-mno-sse`, `-mno-avx` and similar options to tell the compiler to avoid emitting any SIMD code. – maxschlepzig Sep 26 '20 at 18:37
  • 1
    @maxschlepzig: note that x86-64 uses XMM registers as part of the calling convention for scalar float / double. With `-mno-sse` you'd need to totally avoid any FP math (at least in function call / return). For kernel code or something, avoiding any FP math is generally sufficient to avoid any x87 instructions *within* functions, and GCC doesn't auto-vectorize with MMX instructions even when SSE2 isn't available, so you generally don't need `-mno-mmx`. – Peter Cordes Nov 02 '20 at 21:35
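To make the distinction in these comments concrete, here is a small sketch of purely scalar floating-point code. The function name scale_and_add and the file name in the comment are hypothetical. There is no loop here, so there is nothing for the auto-vectorizer to do, yet on x86-64 the generated assembly will still mention XMM registers, because that is where scalar double arguments and return values live.

```c
/* scalar_demo.c (hypothetical file name)
 *
 * Inspect the assembly with, for example:
 *     gcc -O3 -fno-tree-vectorize -S scalar_demo.c
 *
 * On x86-64, x and y arrive in XMM0 and XMM1 and the result is returned
 * in XMM0. GCC typically uses scalar SSE2 instructions such as addsd or
 * mulsd here; these operate on one double, not on a packed vector.
 */
double scale_and_add(double x, double y)
{
    return 2.0 * x + y;
}
```

So seeing xmm0 in the output does not by itself mean the code was vectorized; packed instructions (for example addpd or mulps) are the ones to look for.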
9

You can also selectively enable and disable vectorization with the optimize function attribute or the optimization pragmas:

http://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html

http://gcc.gnu.org/onlinedocs/gcc/Function-Specific-Option-Pragmas.html

e.g.

__attribute__((optimize("no-tree-vectorize")))
void f(double * restrict a, double * restrict b)
{
    for (int i = 0; i < 256; i++)
        a[i] += b[i];
}
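The pragma form mentioned above could look roughly like the following. This is only a sketch: the function names add_no_vec and add_default are invented for illustration, and it assumes a GCC version that supports the push_options / pop_options pragmas described on the second linked page.

```c
/* Save the current optimization options, then turn the vectorizer off
 * for every function defined until the matching pop_options. */
#pragma GCC push_options
#pragma GCC optimize ("no-tree-vectorize")

/* Compiled without auto-vectorization. */
void add_no_vec(double *restrict a, const double *restrict b, int n)
{
    for (int i = 0; i < n; i++)
        a[i] += b[i];
}

/* Restore whatever options were given on the command line. */
#pragma GCC pop_options

/* Compiled with the command-line options, e.g. vectorized at -O3. */
void add_default(double *restrict a, const double *restrict b, int n)
{
    for (int i = 0; i < n; i++)
        a[i] += b[i];
}
```

The pragma is handy when several functions in one file need the same treatment; the attribute is more precise when only a single function is affected.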
jtaylor
1

Excellent. This is useful now that GCC has become more aggressive at vectorizing, e.g.

extern "C" __attribute__((optimize("no-tree-vectorize")))
/* Subroutine */
int s111_ (integer * ntimes, integer * ld, integer * n,
           real * ctime, real * dtime,
           real * __restrict a, real * b, real * c__, real * d__,
           real * e, real * aa, real * bb, real * cc)
{
    ....
    for (i__ = 2; i__ <= i__2; i__ += 2)
        a[i__] = a[i__ - 1] + b[i__];
    ....

In the case posted above, removing __restrict used to be enough to prevent vectorization, but g++ 6.0 now vectorizes the loop even without __restrict.

chus
tim18