To compare which flags various -march settings enable, I am comparing the output of the following commands, as detailed in this SO answer:
$ gcc -Q -march=native --help=target
$ gcc -Q -march=skylake-avx512 --help=target
For the avoidance of doubt, the arch detected by -march=native is skylake-avx512:
$ gcc -Q -march=native --help=target | grep march
-march= skylake-avx512
Most of the flags reported for the two -march variants match exactly.
However, there are a few differences:
$ diff <(gcc -Q -march=native --help=target) <(gcc -Q -march=skylake-avx512 --help=target)
12c12
< -mabm [enabled]
> -mabm [disabled]
119c119
< -mpku [disabled]
> -mpku [enabled]
136c136
< -mrtm [enabled]
> -mrtm [disabled]
138c138
< -msgx [disabled]
> -msgx [enabled]
It is these differences which have prompted me to ask this question.
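To make the comparison easier to repeat for other -march values, the diff above can be reduced to just the flags whose state changes. This is a minimal sketch, assuming both listings have been saved to files; the function name flag_diff and the file names in the usage comment are hypothetical and not part of gcc.

```shell
# Hypothetical helper: list flags whose [enabled]/[disabled] state differs
# between two saved `gcc -Q ... --help=target` listings.
#
# Usage (file names are placeholders):
#   gcc -Q -march=native --help=target > native.txt
#   gcc -Q -march=skylake-avx512 --help=target > named.txt
#   flag_diff native.txt named.txt
flag_diff() {
    awk '
        # Skip anything that is not "<flag> [enabled]" or "<flag> [disabled]",
        # so value options such as -march= and -mtune= are not compared.
        $2 != "[enabled]" && $2 != "[disabled]" { next }
        # First file: remember the state of each flag.
        FILENAME == ARGV[1] { first[$1] = $2; next }
        # Second file: print flags whose state differs from the first file.
        ($1 in first) && first[$1] != $2 { print $1, first[$1], "->", $2 }
    ' "$1" "$2"
}
```

For the two listings in this question, I would expect it to report only -mabm, -mpku, -mrtm and -msgx.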
How does -march=native choose which instruction sets to enable and which to disable?
I have the following conjecture:
-march=native uses CPUID instructions to determine the supported instruction sets etc. in order to detect the processor variant.
-march=foobar uses a hardcoded list of the instruction sets that processor foobar supports.
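One way to cross-check the first half of this conjecture, on Linux at least, is to look at which of the four differing features the kernel reports for the CPU. This is only a sketch: the function name cpu_has is hypothetical, it assumes the Linux /proc/cpuinfo layout with a "flags" line, and the kernel's view may not match raw CPUID exactly (for example if firmware or the kernel has disabled a feature), so it does not show what gcc itself does.

```shell
# Hypothetical helper: report which feature names appear in the "flags"
# line of a cpuinfo-style file.
#   $1      path to the file (on Linux: /proc/cpuinfo)
#   $2...   feature names as spelled by the kernel, e.g. abm pku rtm sgx
#
# Usage: cpu_has /proc/cpuinfo abm pku rtm sgx
cpu_has() {
    file=$1
    shift
    # Take the first "flags" line and keep the part after the colon.
    flags=$(grep -m1 '^flags' "$file" | cut -d: -f2)
    for feat in "$@"; do
        # Pad with spaces so that only whole words match.
        case " $flags " in
            *" $feat "*) echo "$feat: reported" ;;
            *)           echo "$feat: not reported" ;;
        esac
    done
}
```

If the features it reports line up with the [enabled] column of the -march=native listing, that would at least be consistent with the idea that native is driven by runtime detection rather than by a table.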
If that is correct then I can see two possible ways this shakes out:
Option 1:
It is possible that -march=native does not get it 100% correct, whereas the hardcoded table of supported instruction sets is updated when a new processor is released and is therefore more likely to be correct.
In that case we would expect -march=foobar to be the "more correct" flag.
Option 2:
-march=native uses CPUID instructions to determine the supported instruction sets and is therefore guaranteed to be correct, whereas -march=foobar uses a hardcoded list of instruction sets which may not be correct.
In that case we would expect -march=native to be the "more correct" flag.
If Option 2 is correct, one could surmise that using -march=foobar could leave an unsupported instruction set enabled, and that the program would crash if the compiler emitted instructions from that set.
I have thus far been unsuccessful in finding the answer as to whether either or any of the above is correct.
If I want to target a specific arch, be sure that all (and only) supported instruction sets are enabled, and am unable to use -march=native, what is the best way to do this?