
I'd like to mess around with some AVX intrinsic functions. I'd like gcc to use AVX exclusively if possible, similar to /arch:AVX in Visual Studio. Is there a way to do this in gcc with mex?

I tried using something like:

mex -g -O $CFLAGS='$CFLAGS -march=corei7-avx' ncorr_alg_rgdic.cpp standard_datatypes.o ncorr_datatypes.o

But the shell reports eval: 1: = -march=corei7-avx: not found. Does anyone know which flag I should use and how to get mex to accept it? By default it seems to be using SSE instructions (looking at the assembly output I see some mulsd instructions), but I don't want to mix SSE with AVX, as I've read that mixing them can cause problems.

EDIT1:

I'm using Ubuntu 11.04 with gcc 4.6.1.

EDIT2:

Compiling with: mex CXXOPTIMFLAGS='-mtune=corei7-avx -S' ncorr_alg_rgdic.cpp standard_datatypes.o ncorr_datatypes.o

Yields:

movsd   -304(%rbp), %xmm1
movsd   .LC16(%rip), %xmm0
mulsd   %xmm0, %xmm1

Compiling with either:

mex CXXOPTIMFLAGS='-mavx -S' ncorr_alg_rgdic.cpp standard_datatypes.o ncorr_datatypes.o

or:

mex CXXOPTIMFLAGS='-march=corei7-avx -S' ncorr_alg_rgdic.cpp standard_datatypes.o ncorr_datatypes.o

Yields:

vmovsd  -304(%rbp), %xmm1
vmovsd  .LC16(%rip), %xmm0
vmulsd  %xmm0, %xmm1, %xmm1

Now I'm pretty sure mulsd is an SSE instruction. Is vmulsd an AVX instruction (strangely, googling it didn't yield any results)? I also don't see any ymm registers being used, which is strange.

JustinBlaber
  • It should be just `-mavx`. – Mysticial Apr 22 '13 at 20:26
  • @Mysticial If I use `$CFLAGS='$CFLAGS -mavx'`, the matlab terminal says `eval: 1: = -mavx: not found`... This may be because I'm not passing the compiler flag properly to mex. – JustinBlaber Apr 22 '13 at 20:32
  • What platform are you targeting? On Windows, `CFLAGS` should be replaced with `COMPFLAGS`. Also, the `-g` option disables optimizations, and as such is not compatible with the `-O` option. GCC may not emit AVX instructions unless you're compiling with a minimum level of optimizations enabled. Remove `-g`, and add `-v` to get mex to print out the compiler invocation command line. Look through that to make sure the appropriate optimization level is enabled. – Praetorian Apr 22 '13 at 20:32
  • @Praetorian I was keeping the `-g` flag because the way I had been viewing assembly (to ensure AVX instructions are being used) was by using gdb. Is `-O` really not compatible with `-g`? If I use both, the file runs at about `-O` speeds but also allows me to debug it. – JustinBlaber Apr 22 '13 at 20:39
  • Hmm, maybe it doesn't. The documentation is confusing. `-g` description: *Create ... additional symbolic information for use in debugging. This option disables the mex default behavior of optimizing built object code*. `-O` description: *Optimize the object code. ... If the -g option appears without the -O option, optimization is disabled.* You should be able to view assembly by passing the `-S` switch to GCC (once you figure out your CFLAGS problem :-)) – Praetorian Apr 22 '13 at 20:44
  • @Praetorian thanks for the `-S` switch. Did not know about this!! I also added some stuff to my post in edit2. – JustinBlaber Apr 22 '13 at 21:07
  • @Mysticial I used `-mavx` and it resulted in a `vmulsd` instruction instead of a `mulsd`. Would you happen to know why no `ymm` registers appear to be used? – JustinBlaber Apr 22 '13 at 21:09
  • @jucestain Yes. That's normal. There's no need to use the full `ymm` registers when you're only doing scalar operations. – Mysticial Apr 22 '13 at 21:27
  • @Mysticial Thanks, I think the way the code is written now it'll be hard for the compiler to identify vector operations. I'm going to experiment with some intrinsics just as a learning experience to see if I can get any results. Thanks a bunch. – JustinBlaber Apr 22 '13 at 21:31
  • Yeah, I usually don't rely on compiler vectorization at all. It fails on all the complicated cases, and the simple cases are rarely performance bottlenecks. – Mysticial Apr 22 '13 at 21:33
  • @Mysticial Interesting. Very excited to try this stuff out. – JustinBlaber Apr 22 '13 at 21:37
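To illustrate the point made in the comments about scalar versus packed operations, here is a minimal sketch of an element-wise multiply written with AVX intrinsics. The function name `multiply_elementwise` is hypothetical and not from the original code. The packed path is only compiled when the compiler defines `__AVX__` (which GCC does under `-mavx` or `-march=corei7-avx`); otherwise the plain loop is used, which is the kind of code that ends up as `mulsd`/`vmulsd`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#ifdef __AVX__
#include <immintrin.h>  // AVX intrinsics (__m256d, _mm256_mul_pd, ...)
#endif

// Hypothetical helper (not from the original post): out[i] = a[i] * b[i].
std::vector<double> multiply_elementwise(const std::vector<double>& a,
                                         const std::vector<double>& b) {
    assert(a.size() == b.size());
    std::vector<double> out(a.size());
    std::size_t i = 0;
#ifdef __AVX__
    // Packed path: four doubles per iteration, held in 256-bit ymm registers.
    for (; i + 4 <= a.size(); i += 4) {
        __m256d va = _mm256_loadu_pd(&a[i]);  // unaligned load of 4 doubles
        __m256d vb = _mm256_loadu_pd(&b[i]);
        _mm256_storeu_pd(&out[i], _mm256_mul_pd(va, vb));
    }
#endif
    // Scalar tail (or the whole array when AVX is not enabled).
    for (; i < a.size(); ++i) {
        out[i] = a[i] * b[i];
    }
    return out;
}
```

With `-mavx -S`, the packed loop should be where `ymm` registers show up in the assembly, while the scalar tail keeps using `xmm` registers.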

1 Answer


What I've found is that mex uses this format:

mex -v CFLAGS='$CFLAGS -Wall' LDFLAGS='$LDFLAGS -w' yprime.c

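You should try removing the first $ sign, so that the argument starts with CFLAGS= rather than $CFLAGS=. As for the flag itself, -march=corei7-avx (or -mavx) is what enables AVX code generation; -mtune=corei7-avx only tunes for that CPU and does not enable AVX instructions.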

andrjas
  • +1 This solved part of the problem. I just need to figure out now if AVX instructions are actually being used now. – JustinBlaber Apr 22 '13 at 21:13
  • It probably does. According to Wikipedia, AVX introduces a 3-operand form of instructions (vmulsd %xmm0, %xmm1, %xmm1). The new AVX registers ymmX are simply an extension of xmmX from 128 to 256 bits. If your code only performs scalar operations, the compiler has no reason to use the full width. BTW: could it be that after your 2nd edit some other optimizations were removed (like -O3)? – andrjas Apr 22 '13 at 21:24
  • It appears that if I use `CXXOPTIMFLAGS='$CXXOPTIMFLAGS'`, it deletes the `-O` compiler flag. My understanding is that `$CXXOPTIMFLAGS` is supposed to append to the flags which are already defined in `CXXOPTIMFLAGS`, which are `-O -DNDEBUG`. – JustinBlaber Apr 22 '13 at 21:32
  • Just as a final note, I realized through the `mex` documentation that you need to do `mex CXXOPTIMFLAGS="\$CXXOPTIMFLAGS -mavx" ncorr_alg_rgdic.cpp standard_datatypes.o ncorr_datatypes.o`. The backslash and the `"` are required in order to append `-mavx` to the default settings when compiling the `mex` file through the matlab terminal. – JustinBlaber Apr 23 '13 at 00:27
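One way to check that a flag such as `-mavx` actually reached the compiler, without reading the assembly, is to test GCC's predefined macros. The sketch below assumes GCC, which defines `__AVX__` when AVX code generation is enabled and `__SSE2__` when SSE2 is enabled. The function name `simd_target` is hypothetical and only for illustration.

```cpp
#include <string>

// Hypothetical helper (not from the original post): reports which SIMD
// instruction set the compiler was told to target, based on the macros
// GCC predefines for the active -m flags.
std::string simd_target() {
#if defined(__AVX__)
    return "AVX";     // built with -mavx, -march=corei7-avx, or similar
#elif defined(__SSE2__)
    return "SSE2";    // the default on x86-64
#else
    return "scalar";  // no SSE2/AVX code generation enabled
#endif
}
```

Inside a MEX file the result could be printed with `mexPrintf("%s\n", simd_target().c_str());`. If it still reports SSE2 after adding `-mavx`, the flag is probably being dropped by the way it is quoted on the `mex` command line.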