The `-march` and `-m32` / `-m64` gcc options are orthogonal. 64-bit mode doesn't support `pushl`. `gcc -march=i486` doesn't imply `-m32`. Thus `gcc -m32` is necessary, to invoke `as --32`.
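A minimal sketch of that point, assuming an x86-64 Linux machine with GCC and binutils installed. The file name `push.s` and the variable names are only for illustration.

```shell
dir=$(mktemp -d)
echo 'pushl %eax' > "$dir/push.s"    # a 32-bit push, which 64-bit mode can't encode

# -march= alone doesn't switch modes: an x86-64 GCC still runs GAS in 64-bit mode.
if gcc -march=i486 -c "$dir/push.s" -o "$dir/a.o" 2>/dev/null; then
    without_m32=accepted
else
    without_m32=rejected
fi

# -m32 is the option that makes GCC invoke the assembler as `as --32`.
if gcc -m32 -c "$dir/push.s" -o "$dir/b.o" 2>/dev/null; then
    with_m32=accepted
else
    with_m32=rejected
fi

echo "without -m32: $without_m32, with -m32: $with_m32"
```

On a typical x86-64 toolchain I'd expect only the `-m32` build to succeed. Assembling a `.s` file with `-m32 -c` never links, so it shouldn't need 32-bit libraries.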
Also, GCC doesn't pass on its `-march=` option to GAS, using it only for C->asm compilation.
By default, GAS accepts any instructions it knows about. So `gcc -m32 -c bswap.s` works, and would also accept AVX512VBMI instructions like `vpmultishiftqb (%ecx){1to8}, %zmm0, %zmm1` (broadcast-load and bitfield-extract) without further options.
This is basically the opposite of how GCC works when compiling C to asm, where it has a default target baseline (e.g. for 32-bit mode, often i686 or i686 + SSE2, allowing instructions like CMOV).
This makes some sense because in asm, instruction choice is governed by the source. If you don't want to use new instructions for compat with old CPUs, that's up to you. But for GCC, where a machine is generating asm, you might want portable binaries that can run on any CPU, or any CPU newer than some baseline. Or a binary that will use everything your CPU has (`-march=native`), avoiding instructions your CPU doesn't support.
If you use new instructions via inline asm, you can still compile with `gcc` without a `-march` option. (But normally it's better to use intrinsics to have GCC emit those instructions itself, so it knows what's going on.)
If you want to tell GAS to impose limits, e.g. to catch mistakes like accidentally using `cmov` or `cmpxchg8b` when you intended your code to be able to run on a 486, its `as -march=i486` option or `.arch i486` directive in the source supports that.
(See the GAS manual; the microarchitecture names are similar to what `gcc -march=` accepts, except for recent Intel where GCC accepts `skylake`, but GAS would need `corei7+avx2+fma+movbe+bmi2` or something, and that's still incomplete.)
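As a rough sketch of what that limit looks like in practice, assuming x86 binutils on the `PATH`: the file names and the `verdict` helper are made up for this example.

```shell
dir=$(mktemp -d)
cat > "$dir/cmov.s" <<'EOF'
bswap  %eax           # available since the 486
cmovne %ecx, %eax     # CMOV arrived with P6 (i686), so a 486 can't run it
EOF

# Hypothetical helper: run a command quietly and report whether it succeeded.
verdict() {
    if "$@" >/dev/null 2>&1; then echo accepted; else echo rejected; fi
}

# No limit: GAS takes any instruction it knows about.
by_default=$(verdict as --32 "$dir/cmov.s" -o "$dir/a.o")

# Limit requested on the command line.
with_option=$(verdict as --32 -march=i486 "$dir/cmov.s" -o "$dir/b.o")

# Same limit requested from inside the source with a directive.
{ echo '.arch i486'; cat "$dir/cmov.s"; } > "$dir/cmov_arch.s"
with_directive=$(verdict as --32 "$dir/cmov_arch.s" -o "$dir/c.o")

echo "default: $by_default, -march=i486: $with_option, .arch i486: $with_directive"
```

With the limit in place, GAS should complain about the `cmovne` line and leave `bswap` alone, since `bswap` is a 486 instruction.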
To get GCC to run `as --32 -march=i486`, you use
gcc -c -m32 -Wa,-march=i486 foo.s
If you omit the `-m32`, you get Assembler messages: `Fatal error: 64bit mode not supported on 'i486'.`
Fun fact: GAS has lots of other x86 options that GCC doesn't set. I'm showing the `gcc -Wa,gas-option` form; if you were running `as --32` directly, you'd just use `as --32 -Os` or whatever.
- `gcc -Wa,-Os` - optimize your asm for size, e.g. shortening `mov $1, %rax` to `mov $1, %eax` because that's architecturally equivalent, or `test $1, %eax` (5 bytes) to `test $1, %al` (2 bytes).
- `gcc -Wa,-mbranches-within-32B-boundaries` - see "How can I mitigate the impact of the Intel jcc erratum on gcc?"
- `gcc -Wa,-msse2avx` - encode SSE instructions with VEX prefix.
- `gcc -Wa,-muse-unaligned-vector-move` - translate `movaps` to `movups` and so on. (But it can't transparently turn `paddb (%ecx), %xmm0` into something that doesn't require alignment, so it's probably only useful with AVX, if you want to relax the alignment requirements for a function. In AVX, only `vmovaps`/`vmovdqa` load/store do alignment enforcement; memory source operands for ALU instructions are like `vmovups`.)
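To see the `-Os` rewriting from the first item for yourself, something like the following should work, assuming a binutils recent enough to know `-Os` and the `size` and `awk` tools. The `text_size` helper and the file names are invented for this sketch.

```shell
dir=$(mktemp -d)
cat > "$dir/size.s" <<'EOF'
mov  $1, %rax
test $1, %eax
EOF

# Hypothetical helper: assemble size.s with the given extra `as` options and
# print the size of .text in bytes (prints nothing if assembling fails).
text_size() {
    as --64 "$@" "$dir/size.s" -o "$dir/out.o" 2>/dev/null &&
        size -A "$dir/out.o" | awk '$1 == ".text" { print $2 }'
}

plain=$(text_size || true)
small=$(text_size -Os || true)

echo "default .text: ${plain:-n/a} bytes, with -Os: ${small:-n/a} bytes"
```

Running `objdump -d` on the two object files shows which encodings GAS actually picked for each instruction.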
I've never really wanted to use any of these options (except the workaround for Skylake's JCC-erratum performance pothole), but it's neat that they exist.