
I am trying to learn AVX instructions, and while running a basic program I receive

Illegal instruction (core dumped)

The code is below; I am compiling it with

g++ -mavx512f 1.cpp

What exactly is the problem, and how do I fix it? Thank you!

#include <immintrin.h>
#include<iostream>
using namespace std;

void add(const float a[], const float b[], float res[], int n)
{
    int i = 0;

    for(; i < (n & ~15); i += 16)   // a __m512 holds 16 floats; round n down to a multiple of 16
    {
        __m512 x = _mm512_loadu_ps( &a[i] );
        __m512 y = _mm512_loadu_ps( &b[i] );

        __m512 z = _mm512_add_ps(x,y);
        _mm512_storeu_ps(&res[i],z);   // unaligned store; _mm512_stream_ps requires a 64-byte-aligned address
    }

    for(; i<n; i++) res[i] = a[i] + b[i];
}

int main()
{
    int n = 100000;
    float a[n], b[n], res[n];
    for(int i = 0;i < n; i++)
    {
        a[i] = i;
        b[i] = i+10;
    }
    add(a,b,res,n);
    for(int i=0;i<n;i++) cout<<res[i]<<" ";
    cout<<endl;
    return 0;
}

Peter Cordes
  • Welcome to [SO]. It's hard to help you debug your code if we cannot fully compile it. You should add enough code [mcve] so that someone else doesn't have to create other bits to test it. As it stands it's not going to be worth the time to test your code out. In this case, can you also ensure that your CPU supports AVX2 and AVX512. :) – Ahmed Masud Jun 16 '19 at 19:06
  • Probably your CPU doesn't support AVX512. Use `g++ -march=native` to enable everything your CPU supports. If you get compile errors, your CPU does *not* support AVX512. – Peter Cordes Jun 16 '19 at 19:17
  • what lets you think you can use this instruction on your CPU? – OznOg Jun 16 '19 at 19:19
  • @PeterCordes I get compile errors with it. So my CPU doesn't support AVX512. Thanks! – prajjwal_jha Jun 16 '19 at 19:22

1 Answer


Probably your CPU doesn't support AVX512 at all.
Only CPUs of these and newer generations support AVX-512:

  • AMD: Zen 4 and later.

  • Server/workstation: Skylake-SP ("Xeon Scalable Performance") and later,
    Skylake-X high-end desktop/workstation.

  • Client: Ice Lake and later e.g. i5-1035G4, and Rocket Lake desktop, e.g. i5-11600.
    (Also the very-limited-release Cannon Lake laptop chip)
    Celeron / Pentium versions of these have AVX2 but not AVX-512 (see footnote 1).

    Not Alder Lake (12th gen); Intel regressed their AVX-512 support, and are actively blocking people from using the AVX-512 support in the silicon, which was initially usable with the E-cores disabled.

  • Xeon Phi compute cards, 2nd gen and later (Knight's Landing).


Compiler options

Use g++ -O3 -march=native (or the same options with clang++) to enable everything your CPU supports.

If you get compile errors (like undeclared function _mm512_loadu_ps), your CPU does not support AVX512: g++ didn't enable it, so immintrin.h didn't define that intrinsic.

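(With current GCC, the error is instead along the lines of "inlining failed in call to 'always_inline' ...: target specific option mismatch", because the header declares the intrinsic but the target options don't allow inlining it.)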

Only use separate -mavx512f and -mtune= options if you want to make a binary for other CPUs, not just the machine you're compiling on.
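
To see whether a given set of options actually enabled AVX-512F, one option is to test the predefined macro __AVX512F__, which GCC and clang define when the compile target includes AVX-512F. A minimal sketch (the helper names here are made up for illustration):

```cpp
#include <iostream>

// Hypothetical helper: true if this translation unit was compiled with
// AVX-512F enabled (e.g. -mavx512f, or -march=native on a capable CPU).
constexpr bool avx512f_enabled_at_compile_time()
{
#ifdef __AVX512F__
    return true;
#else
    return false;
#endif
}

// Hypothetical helper: prints the compile-time status.
void print_avx512f_compile_status()
{
    std::cout << "AVX-512F enabled at compile time: "
              << (avx512f_enabled_at_compile_time() ? "yes" : "no") << '\n';
}
```

This only reports what the compiler was told to target. With an explicit -mavx512f it says "yes" even on a CPU that will fault on the first AVX-512 instruction, which is the situation in the question.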

Related: How to test AVX-512 instructions w/o supported hardware?

MSVC and ICC do let you use intrinsics without telling the compiler the target supports them, so this method of checking your code against the CPU doesn't work with those compilers. They'll happily let you compile code that won't run on the current CPU. (Because MSVC assumes you're going to do runtime CPU detection and dispatching, instead of distributing source code for everyone to optimize for their own machine.)


More about CPUs without AVX-512

Intel processor name/number meanings

AMD CPUs before Zen 4 lack AVX-512, and older Intel CPUs also lack it.
Skylake-client does not have AVX-512, only Skylake-server.
Intel Alder Lake hybrid (big.LITTLE) CPUs don't have usable AVX-512, only AVX2, even on the big cores.
Low-power CPUs like Silvermont / Tremont don't even have AVX1.

Also note, there are multiple extensions to AVX-512, like AVX-512VPOPCNTDQ that introduces SIMD instructions to count set bits in each SIMD element. Check Wikipedia's CPUs with AVX-512 table to see which CPU has what. AVX-512F is the "foundation", and AVX-512VL allows using cool new instructions on 128 and 256-bit vectors.
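
As a rough way to see which of these extensions the current machine reports, the GCC/clang builtin __builtin_cpu_supports can be queried per feature. This is a sketch under that assumption; avx512_report is a made-up helper name, and the feature list is only a small sample.

```cpp
#include <string>

// Hypothetical helper: reports a few AVX-512 extensions as "name: yes/no" lines.
// __builtin_cpu_supports (GCC/clang) requires a string literal argument,
// so each feature is queried explicitly rather than in a loop.
std::string avx512_report()
{
    auto line = [](const char* name, bool ok) {
        return std::string(name) + ": " + (ok ? "yes" : "no") + "\n";
    };

    std::string r;
    r += line("avx512f",         __builtin_cpu_supports("avx512f"));
    r += line("avx512vl",        __builtin_cpu_supports("avx512vl"));
    r += line("avx512bw",        __builtin_cpu_supports("avx512bw"));
    r += line("avx512dq",        __builtin_cpu_supports("avx512dq"));
    r += line("avx512vpopcntdq", __builtin_cpu_supports("avx512vpopcntdq"));
    return r;
}
```

Printing the returned string (for example with std::cout << avx512_report();) gives one line per extension. A CPU like the one in the question would be expected to show "no" for all of them.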

Footnote 1: Pentium/Celeron versions of older Intel CPUs don't even have AVX, just SSE4.2. (Also lacking BMI1/2 because they disabled decoding of VEX prefixes).

Peter Cordes