
I'm planning to build different versions of a numerically intensive program for x86-64 architectures. Conveniently, in 2020, four x86-64 microarchitecture levels were defined, and they can be passed to the compiler via the "-march" flag. Thus, with GCC 11 (and similarly with Clang 12), I should be able to use AVX, AVX2, and LZCNT instructions by specifying

gcc -march=x86-64-v3

and extend that to AVX-512 with

gcc -march=x86-64-v4
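
For reference, which of these levels a given machine can actually run can be probed at start-up. The following is only a minimal sketch, assuming GCC or Clang on x86-64: the function name mynumerics_x86_level is hypothetical, and the check covers only a representative subset of the features each level requires, so it is not a complete level test.

```c
/* Hypothetical helper (not part of any library): reports the highest
 * x86-64 microarchitecture level that the running CPU appears to support.
 * Returns 4 or 3, or 0 for "below v3, or not an x86-64 GCC/Clang build".
 * Only a subset of each level's required features is tested. */
int mynumerics_x86_level(void)
{
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    __builtin_cpu_init();

    /* x86-64-v4 adds the AVX-512 F, BW, CD, DQ and VL subsets. */
    if (__builtin_cpu_supports("avx512f") &&
        __builtin_cpu_supports("avx512bw") &&
        __builtin_cpu_supports("avx512cd") &&
        __builtin_cpu_supports("avx512dq") &&
        __builtin_cpu_supports("avx512vl"))
        return 4;

    /* x86-64-v3 includes AVX2, FMA, BMI1 and BMI2 (among others). */
    if (__builtin_cpu_supports("avx2") &&
        __builtin_cpu_supports("fma") &&
        __builtin_cpu_supports("bmi") &&
        __builtin_cpu_supports("bmi2"))
        return 3;
#endif
    return 0;
}
```

A launcher compiled for the baseline could call this before any v3 or v4 code runs and exit with a readable message instead of crashing on an unsupported instruction.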

If I create a Debian package out of this, it might be called

mynumerics_22.04_amd64.deb

However, there are really two versions. Running the x86-64-v4 version on an x86-64-v3 machine results in SIGILL (illegal instruction). One naming scheme that was suggested to me:

mynumerics_22.04_amd64-x3.deb
mynumerics_22.04_amd64-x4.deb

Here x3 stands for x86-64-v3 and x4 for x86-64-v4, but I think this violates the rules for the Debian architecture specifier. Moving the suffix to the version part of the file name gives:

mynumerics_22.04-x3_amd64.deb
mynumerics_22.04-x4_amd64.deb

However, the package manager could interpret x4 as a later version than x3. I am thinking about suffixing the package name instead:

mynumerics-x3_22.04_amd64.deb
mynumerics-x4_22.04_amd64.deb

A similar microarchitecture issue might arise in the ARM aarch64 world. From my reading, ARMv8-A supports "Advanced SIMD (Neon)", while the "Scalable Vector Extension (SVE)" is an optional extension available from ARMv8.2-A onward. This might yield:

mynumerics-a8_22.04_arm64.deb
mynumerics-a82_22.04_arm64.deb

Is there a better way to handle microarchitecture level differences?

Peter Cordes
  • Out of curiosity, is it not an option to include both implementations and select between them at runtime (e.g. because of size concerns)? – Mona the Monad Jul 06 '22 at 01:01
  • @MonatheMonad For a small number of affected functions, that might be doable. For a library of 100-200 numerical functions, this gets out of hand. In effect, the source code would need to be duplicated (#included ??) for each variant architecture. For GCC, the way to do this is something like: int core2_func (void) __attribute__ ((__target__ ("arch=core2"))); int sse3_func (void) __attribute__ ((__target__ ("sse3"))); – Justin JRTI Jul 15 '22 at 23:48
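
Regarding the duplication concern in the comments: GCC also offers the target_clones function attribute, which asks the compiler to generate several variants of one function from a single source definition and pick one at load time. This is only a sketch under stated assumptions: GCC on x86-64 Linux with ifunc support (glibc), the macro MYNUMERICS_CLONES and the function mynumerics_dot are hypothetical names, and the feature list "avx2"/"avx512f" is an illustrative choice rather than an exact match for the v3/v4 levels.

```c
#include <stddef.h>

/* Hypothetical convenience macro: expands to target_clones on GCC/x86-64,
 * and to nothing elsewhere so the code still builds as a single variant. */
#if defined(__x86_64__) && defined(__GNUC__) && !defined(__clang__)
#define MYNUMERICS_CLONES \
    __attribute__((target_clones("default", "avx2", "avx512f")))
#else
#define MYNUMERICS_CLONES
#endif

/* One source definition; where the attribute is active, the compiler emits
 * a baseline, an AVX2, and an AVX-512F clone plus a resolver that selects
 * among them for the running CPU. */
MYNUMERICS_CLONES
double mynumerics_dot(const double *x, const double *y, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i)
        sum += x[i] * y[i];
    return sum;
}
```

Callers use mynumerics_dot like any ordinary function. Whether this scales to a library of 100-200 functions (binary size, build time, inlining across cloned functions) would need to be measured; it does not by itself answer the packaging question.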
