According to the ARM ARM, __ARM_NEON__ is defined when Neon SIMD instructions are available. I'm having trouble getting GCC to provide it.

Neon is available on this BananaPi Pro dev board running Debian 8.2:

$ cat /proc/cpuinfo | grep neon
Features    : swp half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt 

I'm using GCC 4.9:

$ gcc --version
gcc (Debian 4.9.2-10) 4.9.2

Try GCC and -march=native:

$ g++ -march=native -dM -E - </dev/null | grep -i neon
#define __ARM_NEON_FP 4

OK, try what Google uses for Android when building for Neon:

$ g++ -march=armv7-a -mfpu=vfpv3-d16 -mfloat-abi=softfp -dM -E - </dev/null | grep -i neon
#define __ARM_NEON_FP 4

Maybe ARMv7-a with hard float:

$ g++ -march=armv7-a -mfloat-abi=hard -dM -E - </dev/null | grep -i neon
#define __ARM_NEON_FP 4

My questions are:

  • why am I not seeing __ARM_NEON__?
  • how do I detect Neon availability in the preprocessor?

And maybe:

  • what GCC switches should I use to enable Neon SIMD instructions?

Related, on a LeMaker HiKey, which is AARCH64/ARM64 running Linaro with GCC 4.9.2, here's the output from the preprocessor:

$ cpp -dM </dev/null | grep -i neon
#define __ARM_NEON 1

According to ARM, this board does have Advanced SIMD instructions even though:

$ cat /proc/cpuinfo 
Processor   : AArch64 Processor rev 3 (aarch64)
...
Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32
jww
  • `-mfpu=neon`, or maybe `-mfpu=neon-vfpv4`. – EOF May 05 '16 at 12:24
  • Thanks @EOF. I need to double check the Android build flags to see why it's not being used for AOSP toolchains (or maybe it is and my notes are incomplete/broken). – jww May 05 '16 at 14:05

1 Answer

There are a number of questions hidden in here; I'll try to address them in turn.

According to the ARM ARM, __ARM_NEON__ is defined when Neon SIMD instructions are available. I'm having trouble getting GCC to provide it.

That is compiler documentation for [an old version of] the ARM Compiler rather than the ARM Architecture Reference Manual. A better macro to check for the presence of the Advanced SIMD instructions is __ARM_NEON, which is defined in the ARM C Language Extensions.
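A minimal sketch of a compile-time check, assuming you want to accept both the ACLE spelling (__ARM_NEON) and the legacy spelling (__ARM_NEON__) mentioned in the question. HAVE_NEON and neon_enabled_at_compile_time are hypothetical names chosen for this illustration; they come from neither GCC nor ACLE.

```c
/* Accept either spelling: __ARM_NEON is the ACLE name, __ARM_NEON__ is
   the legacy name that some toolchains define as well (or instead). */
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#  define HAVE_NEON 1   /* hypothetical project-local macro */
#else
#  define HAVE_NEON 0
#endif

/* Returns 1 if the compiler was invoked with options that allow it to
   emit Advanced SIMD (Neon) instructions, 0 otherwise. This reflects
   the build flags only, not the CPU the program later runs on. */
int neon_enabled_at_compile_time(void)
{
    return HAVE_NEON;
}
```

Calling neon_enabled_at_compile_time() from main and printing the result is a quick way to confirm what a given set of compiler flags actually enabled.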

Try GCC and -march=native:

As you may have found, GCC for the ARM target separates out -march (the architecture revision for which GCC should generate code), -mfpu (the floating-point/Advanced SIMD unit available) and -mfloat-abi (how floating-point arguments should be passed, and the presence or absence of a floating-point unit). There are also -mtune (which asks GCC to try to optimise for a particular processor) and -mcpu (which acts as a combination of -mtune and -march).

By asking for -march=native you're asking GCC to generate code appropriate for the detected architecture of the processor on which you are running. This has no impact on the -mfpu setting, and so does not necessarily enable Advanced SIMD instruction generation.

Note that the above only applies to a compiler targeting AArch32. The AArch64 GCC does not support -mfpu and will detect presence of Advanced SIMD support through -march=native.

OK, try what Google uses for Android when building for Neon:

$ g++ -march=armv7-a -mfpu=vfpv3-d16 -mfloat-abi=softfp -dM -E

These build flags are not sufficient to enable support for Advanced SIMD instructions; your notes may be incomplete. Of the -mfpu values supported by GCC 4.9.2, I'd expect any of the following to give you what you want:

neon, neon-fp16, neon-vfpv4, neon-fp-armv8, crypto-neon-fp-armv8
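As a rough illustration of what those flags buy you, here is a sketch of a function that uses Neon intrinsics when the compiler reports Advanced SIMD support and falls back to plain C otherwise. The function name add_f32 is hypothetical; the intrinsics (vld1q_f32, vaddq_f32, vst1q_f32) come from arm_neon.h.

```c
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>   /* only available when Neon is enabled */
#endif

/* Element-wise addition: out[i] = a[i] + b[i] for i in [0, n).
   add_f32 is a hypothetical name used only for this example. */
void add_f32(float *out, const float *a, const float *b, size_t n)
{
    size_t i = 0;
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* Process four floats per iteration in 128-bit Neon registers. */
    for (; i + 4 <= n; i += 4) {
        float32x4_t va = vld1q_f32(a + i);
        float32x4_t vb = vld1q_f32(b + i);
        vst1q_f32(out + i, vaddq_f32(va, vb));
    }
#endif
    /* Scalar loop: handles the tail, or everything when Neon is off. */
    for (; i < n; ++i)
        out[i] = a[i] + b[i];
}
```

On a 32-bit ARM target, building this with something like -march=armv7-a -mfpu=neon (plus the -mfloat-abi value that matches your system libraries) should select the intrinsic path; without a Neon -mfpu value only the scalar loop is compiled.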

According to ARM, this board does have Advanced SIMD instructions even though:

It looks like you're running on an AArch64 kernel, which exposes support for Advanced SIMD through the asimd feature, as in your example output.
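If you want to check the same thing from a program, one option is to look for either name in the Features line. The sketch below assumes the line has the whitespace-separated layout shown in the question's /proc/cpuinfo output; has_flag and cpuinfo_reports_simd are hypothetical helper names. Note that this describes the CPU at run time and is independent of the preprocessor macros.

```c
#include <string.h>

/* Returns 1 if `flag` appears in `line` as a whole whitespace-delimited
   token (so "asimd" does not match inside a longer token).
   `flag` must be a non-empty string. */
static int has_flag(const char *line, const char *flag)
{
    size_t len = strlen(flag);
    const char *p = line;

    while ((p = strstr(p, flag)) != NULL) {
        int starts = (p == line) || p[-1] == ' ' || p[-1] == '\t';
        int ends = p[len] == '\0' || p[len] == ' ' ||
                   p[len] == '\t' || p[len] == '\n';
        if (starts && ends)
            return 1;
        p += len;   /* keep searching after this partial match */
    }
    return 0;
}

/* Returns 1 if a /proc/cpuinfo "Features" line lists Advanced SIMD:
   "neon" on a 32-bit ARM kernel, "asimd" on an AArch64 kernel. */
int cpuinfo_reports_simd(const char *features_line)
{
    return has_flag(features_line, "neon") ||
           has_flag(features_line, "asimd");
}
```

A caller would read /proc/cpuinfo line by line (for example with fgets) and pass each line that begins with "Features" to cpuinfo_reports_simd.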

James Greenhalgh
  • Thanks James. This should help bridge some of my knowledge gaps. – jww May 05 '16 at 18:05
  • Newer gcc no longer defines `__ARM_NEON`. Older gcc (like 5.4) defines both `__ARM_NEON` and `__ARM_NEON__`. gcc5.4 and gcc6.3 define `__ARM_FEATURE_SIMD32 1`. https://godbolt.org/g/LmPjDq Is that what one should check? – Peter Cordes Sep 07 '17 at 22:40
  • The godbolt compilers you've linked look like they are configured in different ways (confirmed with -v). You can enable Neon support by adding `-mfloat-abi=hard` to your command line. You might want to report this configuration difference as a bug to the owner of the service. `__ARM_FEATURE_SIMD32` refers to a different part of the instruction set (the instructions that operate on packed 8-bit values in the 32-bit registers), so it is not a replacement for `__ARM_NEON`. – James Greenhalgh Sep 12 '17 at 09:45