I'm trying to assemble a file that uses ARM's CRC instruction. The assembler is producing an error Error: selected processor does not support 'crc32b w1,w0,w0'
.
There are runtime checks in place, so we are safe with the instruction. The technique works fine on i686 and x86_64. For example, I can assemble a file that uses Intel CRC intrinsics or SHA Intrinsics without -mcrc
or -msha
(and on a machine without the features).
Here is the test case:
$ cat test.cxx
#include <arm_neon.h>
#define GCC_INLINE_ATTRIB __attribute__((__gnu_inline__, __always_inline__, __artificial__))
#if defined(__GNUC__) && !defined(__ARM_FEATURE_CRC32)
__inline unsigned int GCC_INLINE_ATTRIB
CRC32B(unsigned int crc, unsigned char v)
{
unsigned int r;
asm ("crc32b %w2, %w1, %w0" : "=r"(r) : "r"(crc), "r"((unsigned int)v));
return r;
}
#else
// Use the intrinsic
# define CRC32B(a,b) __crc32b(a,b)
#endif
int main(int argc, char* argv[])
{
return CRC32B(argc, argc);
}
And here is the result:
$ g++ test.cxx -c
/tmp/ccqHBPUf.s: Assembler messages:
/tmp/ccqHBPUf.s:23: Error: selected processor does not support `crc32b w1,w0,w0'
Placing the ASM code in a source file and compiling with different options is not feasible because CRC32B
will be used in C++ header files, too.
How do I get GAS to assemble the instruction?
GCC's configuration and options are the reason we are trying to do things this way. User's don't read manuals, so they won't add -march=armv8-a+crc+crypto -mtune=cortex-a53
to CFLAGS
and CXXFLAGS
.
In addition, distros compile to a "least capable" machine, so we want the hardware acceleration routines available. When the library is provided by a distro like Linaro, both code paths (software CRC and hardware accelerated CRC) will be available.
The machine is a LeMaker HiKey, which is ARMv8/Aarch64. It has an A53 processor with CRC and Crypto (CRC and Crypto is optional under the architecture):
$ cat /proc/cpuinfo
Processor : AArch64 Processor rev 3 (aarch64)
processor : 0
...
processor : 7
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer : 0x41
CPU architecture: AArch64
GCC lacks most of the usual defines one expects to be present by default:
$ g++ -dM -E - </dev/null | sort | egrep -i '(arm|neon|aarch|asimd)'
#define __aarch64__ 1
#define __AARCH64_CMODEL_SMALL__ 1
#define __AARCH64EL__ 1
Using GCC's -march=native
does not work on ARM:
$ g++ -march=native -dM -E - </dev/null | sort | egrep -i '(arm|neon|aarch|asimd)'
cc1: error: unknown value ‘native’ for -march
And Clang:
$ clang++ -dM -E - </dev/null | sort | egrep -i '(arm|neon|aarch|asimd)'
#define __AARCH64EL__ 1
#define __ARM_64BIT_STATE 1
#define __ARM_ACLE 200
#define __ARM_ALIGN_MAX_STACK_PWR 4
#define __ARM_ARCH 8
#define __ARM_ARCH_ISA_A64 1
#define __ARM_ARCH_PROFILE 'A'
#define __ARM_FEATURE_CLZ 1
#define __ARM_FEATURE_DIV 1
#define __ARM_FEATURE_FMA 1
#define __ARM_FEATURE_UNALIGNED 1
#define __ARM_FP 0xe
#define __ARM_FP16_FORMAT_IEEE 1
#define __ARM_FP_FENV_ROUNDING 1
#define __ARM_NEON 1
#define __ARM_NEON_FP 0xe
#define __ARM_PCS_AAPCS64 1
#define __ARM_SIZEOF_MINIMAL_ENUM 4
#define __ARM_SIZEOF_WCHAR_T 4
#define __aarch64__ 1
GCC version:
$ gcc -v
...
gcc version 4.9.2 (Debian/Linaro 4.9.2-10)
GAS version:
$ as -v
GNU assembler version 2.24 (aarch64-linux-gnu) using BFD version (GNU Binutils for Ubuntu) 2.24