
Using -mcpu/-march enables a set of extended instructions, such as SSE on x86 or AltiVec, but when building for the current CPU this isn't always enough.

For example, passing -mcpu=cascadelake to clang doesn't enable BMI or the various AVX-512 extensions that might be present on a Cascade Lake CPU.

That's why gcc has an additional option, -mtune=native. Using this option enables all the compiler flags for the extensions supported by the current host CPU. But what's the equivalent for clang?

user2284570
  • Given the long list of `clflush dts mmx aes ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp hwp_pkg_req pku ospke avx512_vnni md_clear flush_l1d arch_capabilities` supported by my CPU, I don't want to have to look up the options one by one. – user2284570 Dec 15 '20 at 18:07

1 Answer


From the clang man page (man clang):

man clang | grep march
       -march=<cpu>
              Specify that Clang should generate code for a specific processor family member and later.  For example, if you specify -march=i486, the compiler is allowed to generate instructions that are valid on i486 and later processors, but which may not exist on earlier ones.
  1. Use -march to enable CPU features.
  2. Use -mtune to optimize for a certain microarchitecture.
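
One way to check which features a given -march value actually turns on is to dump the predefined macros with -dM -E and keep only the ISA-related ones. This is a minimal sketch, assuming clang is on the PATH; the helper name isa_macros and the list of prefixes it filters on are illustrative choices, not something clang provides:

```shell
# isa_macros: read "#define NAME VALUE" lines on stdin and print the names
# of macros that look like x86 ISA feature macros (e.g. __AVX2__).
# The prefix list is an illustrative assumption, not an exhaustive set.
isa_macros() {
    grep -E '^#define __(SSE|SSSE|AVX|BMI|FMA|AES)[A-Z0-9_]*__ ' \
        | awk '{ print $2 }' \
        | LC_ALL=C sort -u
}

# Usage (assumes clang is installed):
#   clang -march=native -dM -E -x c /dev/null | isa_macros
#   clang -dM -E -x c /dev/null | isa_macros
```

Comparing the two usage lines should show the extra feature macros that -march=native defines on the host, while -mtune on its own is not expected to add any.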

On my 11th Gen Intel(R) Core(TM) i7-11700KF @ 3.60GHz with clang 12, the -march=native flag appears to be honored.

Running this command (and the same with gcc) shows the options the driver passes to the compiler:

clang -march=native -E -v - </dev/null 2>&1 | grep cc1

I get a different set of flags enabled for compilation.

With -march=native:

$ clang -march=native -E -v - </dev/null 2>&1 | grep cc1
 "/usr/lib/llvm-12/bin/clang" -cc1 -triple x86_64-pc-linux-gnu -E -disable-free -disable-llvm-verifier -discard-value-names -main-file-name - -mrelocation-model static -mframe-pointer=all -fmath-errno -fno-rounding-math -mconstructor-aliases -munwind-tables -target-cpu icelake-client -target-feature +sse2 -target-feature -tsxldtrk -target-feature +cx16 -target-feature +sahf -target-feature -tbm -target-feature +avx512ifma -target-feature +sha -target-feature +gfni -target-feature -fma4 -target-feature +vpclmulqdq -target-feature +prfchw -target-feature +bmi2 -target-feature -cldemote -target-feature +fsgsbase -target-feature -ptwrite -target-feature -amx-tile -target-feature -uintr -target-feature +popcnt -target-feature -widekl -target-feature +aes -target-feature +avx512bitalg -target-feature -movdiri -target-feature +xsaves -target-feature -avx512er -target-feature -avxvnni -target-feature +avx512vnni -target-feature -amx-bf16 -target-feature +avx512vpopcntdq -target-feature -pconfig -target-feature -clwb -target-feature +avx512f -target-feature +xsavec -target-feature -clzero -target-feature +pku -target-feature +mmx -target-feature -lwp -target-feature +rdpid -target-feature -xop -target-feature +rdseed -target-feature -waitpkg -target-feature -kl -target-feature -movdir64b -target-feature -sse4a -target-feature +avx512bw -target-feature +clflushopt -target-feature +xsave -target-feature +avx512vbmi2 -target-feature +64bit -target-feature +avx512vl -target-feature -serialize -target-feature -hreset -target-feature +invpcid -target-feature +avx512cd -target-feature +avx -target-feature +vaes -target-feature -avx512bf16 -target-feature +cx8 -target-feature +fma -target-feature -rtm -target-feature +bmi -target-feature -enqcmd -target-feature +rdrnd -target-feature -mwaitx -target-feature +sse4.1 -target-feature +sse4.2 -target-feature +avx2 -target-feature +fxsr -target-feature -wbnoinvd -target-feature +sse -target-feature +lzcnt -target-feature +pclmul 
-target-feature -prefetchwt1 -target-feature +f16c -target-feature +ssse3 -target-feature -sgx -target-feature -shstk -target-feature +cmov -target-feature +avx512vbmi -target-feature -amx-int8 -target-feature +movbe -target-feature -avx512vp2intersect -target-feature +xsaveopt -target-feature +avx512dq -target-feature +adx -target-feature -avx512pf -target-feature +sse3 -fno-split-dwarf-inlining -debugger-tuning=gdb -v -resource-dir /usr/lib/llvm-12/lib/clang/12.0.0 -internal-isystem /usr/local/include -internal-isystem /usr/lib/llvm-12/lib/clang/12.0.0/include -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-externc-isystem /include -internal-externc-isystem /usr/include -fdebug-compilation-dir /home/joel -ferror-limit 19 -fgnuc-version=4.2.1 -faddrsig -o - -x c -
$ clang -march=native -E -v - </dev/null 2>&1 | grep cc1 | wc
      2     228    2870

Without -march=native:

$ clang -E -v - </dev/null 2>&1 | grep cc1 
 "/usr/lib/llvm-12/bin/clang" -cc1 -triple x86_64-pc-linux-gnu -E -disable-free -disable-llvm-verifier -discard-value-names -main-file-name - -mrelocation-model static -mframe-pointer=all -fmath-errno -fno-rounding-math -mconstructor-aliases -munwind-tables -target-cpu x86-64 -tune-cpu generic -fno-split-dwarf-inlining -debugger-tuning=gdb -v -resource-dir /usr/lib/llvm-12/lib/clang/12.0.0 -internal-isystem /usr/local/include -internal-isystem /usr/lib/llvm-12/lib/clang/12.0.0/include -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-externc-isystem /include -internal-externc-isystem /usr/include -fdebug-compilation-dir /home/joel -ferror-limit 19 -fgnuc-version=4.2.1 -faddrsig -o - -x c -

$ clang -E -v - </dev/null 2>&1 | grep cc1 | wc
      2      58     799
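
The cc1 line is hard to read by eye, so it can help to list only the features that are switched on (the ones prefixed with +). A minimal sketch, assuming the output format shown above; the function name enabled_features is a hypothetical helper, not a clang tool:

```shell
# enabled_features: read a clang -cc1 command line on stdin and print each
# feature that follows "-target-feature" with a leading '+', one per line,
# sorted and without the '+'. Features prefixed with '-' are skipped.
enabled_features() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "-target-feature" && $(i + 1) ~ /^\+/)
                print substr($(i + 1), 2)
    }' | LC_ALL=C sort -u
}

# Usage (assumes clang is installed):
#   clang -march=native -E -v - </dev/null 2>&1 | grep cc1 | enabled_features
```

Without -march=native the driver passes no -target-feature options in the output above, so the same pipeline would be expected to print nothing.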
Joel