I'm curious if the resulting compiled code will use Intel's AES-NI instructions?
Crypto++ 5.6.1 added support for AES-NI and Carryless Multiplies under GCM. It is used when three conditions are met. First, you are using a version of the library with the support, as noted on the homepage under News (or in the README).
Second, the compiler, assembler and linker must support the instructions. For Crypto++, that means you use at least MSVC 2008 SP1, GCC 4.3, and Binutils 2.19. For MSVC, if you look at config.h, it's guarded as follows (__AES__ is there for GCC and friends, too):
#if ... (_MSC_FULL_VER >= 150030729) ...
#define CRYPTOPP_BOOL_AESNI_INTRINSICS_AVAILABLE 1
#else
#define CRYPTOPP_BOOL_AESNI_INTRINSICS_AVAILABLE 0
#endif
You can look up _MSC_FULL_VER numbers at Visual Studio version. Ironically, I've never seen a similar page on MSDN even though the service packs matter. You have to go to a Chinese site. For example, checked iterators showed up in VS2005 SP1 (IIRC).
For Linux and GCC compatibles, the GNUmakefile checks the version of the compiler and assembler. If they are too old, then the makefile adds CRYPTOPP_DISABLE_AESNI to the command line to disable the support even if __AES__ is defined.
CRYPTOPP_DISABLE_AESNI shows up more often than you think. For example, if you download OpenBSD 6.0 (the current version), then CRYPTOPP_DISABLE_AESNI will be present because their assembler is so old. They are mostly stuck at the pre-GPL-2 version of their tools (apparently they did not agree to the license changes).
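The compile-time gate from Item (2) can be sketched roughly as follows. This is a minimal illustration, not the library's actual config.h: the DEMO_ macro names and the CompiledWithAesni() helper are hypothetical stand-ins for CRYPTOPP_DISABLE_AESNI and CRYPTOPP_BOOL_AESNI_INTRINSICS_AVAILABLE, and the real guard has more conditions (elided with "..." above).

```cpp
// Hypothetical stand-ins for the library's macros; the DEMO_ names are
// assumptions for illustration and are not Crypto++ names.
#if defined(DEMO_DISABLE_AESNI)
// The build system decided the toolchain is too old and opted out,
// even if the compiler itself advertises __AES__.
# define DEMO_AESNI_INTRINSICS_AVAILABLE 0
#elif defined(_MSC_FULL_VER) && (_MSC_FULL_VER >= 150030729)
// MSVC 2008 SP1 or later.
# define DEMO_AESNI_INTRINSICS_AVAILABLE 1
#elif defined(__AES__)
// GCC and compatibles define __AES__ when AES-NI code generation
// is enabled (for example with -maes).
# define DEMO_AESNI_INTRINSICS_AVAILABLE 1
#else
# define DEMO_AESNI_INTRINSICS_AVAILABLE 0
#endif

// Compile-time answer to "was the AES-NI code path built at all?"
constexpr bool CompiledWithAesni()
{
    return DEMO_AESNI_INTRINSICS_AVAILABLE != 0;
}
```

Building with -DDEMO_DISABLE_AESNI forces the macro to 0 regardless of what the compiler reports, which mirrors what the makefile does when it finds an old assembler.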
Third, the CPU supports both AES and SSE4 instructions (the reason for the SSE4 instructions is explained below). These checks are performed at runtime, and the function of interest is called HasAESNI() from cpu.h (there's also a HasSSE4()):
//! \brief Determines AES-NI availability
//! \returns true if AES-NI is determined to be available, false otherwise
//! \details HasAESNI() is a runtime check performed using CPUID
inline bool HasAESNI()
{
    if (!g_x86DetectionDone)
        DetectX86Features();
    return g_hasAESNI;
}
The caveat for Item (3) is that the library must have been compiled with the support from Item (2). If Item (2) did not include compile time support, then Item (3) cannot offer runtime support.
With respect to Item (3) and runtime support, we recently had to tune it. It seems some low-end Atom processors, like D2500's, have SSE2, SSE3, SSSE3 and AES-NI, but not SSE4.1 or SSE4.2. According to Intel ARK, it's an optional configuration of the processor. We received one bug report about an illegal SSE4 instruction in the AES-NI codepath, so we had to add a HasSSE4() check. See PR 172, Check for SSE4 support before using SSE4.1 instruction.
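For the curious, here is a rough sketch of what a lazy runtime check like this can look like. It is not the library's code: the demo namespace, DetectFeatures(), HasSSE41() and CanUseAesniPath() are hypothetical names, and it assumes GCC or Clang on x86, where cpuid.h provides __get_cpuid. On any other target the sketch simply reports the features as unavailable.

```cpp
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
# include <cpuid.h>          // GCC/Clang helper for the CPUID instruction
# define DEMO_HAVE_CPUID 1
#else
# define DEMO_HAVE_CPUID 0   // other targets: features are reported as absent
#endif

namespace demo {

bool g_detectionDone = false;
bool g_hasAESNI = false;
bool g_hasSSE41 = false;

// Query CPUID leaf 1 once and cache the feature bits.
void DetectFeatures()
{
#if DEMO_HAVE_CPUID
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx))
    {
        g_hasSSE41 = (ecx & (1u << 19)) != 0;  // CPUID.1:ECX bit 19 = SSE4.1
        g_hasAESNI = (ecx & (1u << 25)) != 0;  // CPUID.1:ECX bit 25 = AES-NI
    }
#endif
    g_detectionDone = true;
}

bool HasAESNI()
{
    if (!g_detectionDone)
        DetectFeatures();
    return g_hasAESNI;
}

bool HasSSE41()
{
    if (!g_detectionDone)
        DetectFeatures();
    return g_hasSSE41;
}

// Both bits are required. A CPU with AES-NI but without SSE4.1
// (the Atom case described above) must not take the AES-NI code path.
bool CanUseAesniPath()
{
    return HasAESNI() && HasSSE41();
}

} // namespace demo
```

Detection runs once on first use and the results are cached in globals, which is the same shape as the HasAESNI() snippet from cpu.h.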
And if so, what happens if the hardware (older CPU) does not support it?
Nothing. The default CXX implementation is used rather than the hardware accelerated AES.
You might be interested to know we also have other AES hardware acceleration, including ARMv8 Crypto and VIA Padlock. We also provide other hardware acceleration, like CRC32, Carryless-Multiplies and SHA. They all function the same way - compile time support is translated into runtime support.
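Putting Items (2) and (3) together, the selection logic amounts to something like the sketch below. The names (CpuFeatures, AesBackend, SelectAesBackend) are hypothetical and only illustrate the idea; the library does not expose an API shaped like this.

```cpp
namespace demo {

// Runtime feature bits, as a CPUID-based check would report them.
struct CpuFeatures
{
    bool aesni;
    bool sse41;
};

enum class AesBackend { Portable, AesNi };

// compiledWithAesni: the compile-time gate from Item (2).
// cpu:               the runtime gate from Item (3).
AesBackend SelectAesBackend(bool compiledWithAesni, const CpuFeatures& cpu)
{
    if (compiledWithAesni && cpu.aesni && cpu.sse41)
        return AesBackend::AesNi;

    // Older CPU, missing SSE4.1, or no compile-time support:
    // quietly fall back to the default C++ implementation.
    return AesBackend::Portable;
}

} // namespace demo
```

For example, a D2500-like CPU (AES-NI present, SSE4.1 absent) ends up on the portable path, and so does a modern CPU if the library was built with the support disabled.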
(Comment): I just set a breakpoint on DetectX86Features method in cpu.cpp ... and it never triggered ...
This can be tricky for two reasons. First, the calls may be inlined in release builds, so the code is shaped a little differently than you would expect.
Second, there's a global random number generator accessed by GlobalRNG(). GlobalRNG() is AES in OFB mode. When initializers run for the test.cpp translation unit, the GlobalRNG() is created, which causes DetectX86Features() to run very early (before control enters main).
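The effect is easy to reproduce in isolation. In the sketch below (all names are hypothetical, and DemoCipher merely stands in for an object that needs CPU detection in its constructor), a namespace-scope object triggers detection during static initialization, so a breakpoint armed only after main has started never fires.

```cpp
namespace demo {

// Constant-initialized, so both are ready before any constructor runs.
bool g_detectionDone = false;
int  g_detectCalls = 0;

// Stand-in for the real feature detection routine.
void DetectFeatures()
{
    ++g_detectCalls;
    g_detectionDone = true;
}

// Stand-in for a cipher that consults CPU features when constructed.
struct DemoCipher
{
    DemoCipher()
    {
        if (!g_detectionDone)
            DetectFeatures();
    }
};

// A namespace-scope object: its constructor runs during static
// initialization, before control enters main.
DemoCipher s_globalCipher;

} // namespace demo
```

By the time main starts, g_detectCalls is already 1. To catch the call in a debugger, the breakpoint has to be set before the program starts running rather than after stopping in main.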
You may have better luck with observing the low level details with WinDbg.
It's also worth mentioning that AES/GCM can be sped up by interleaving AES with GCM. I believe the idea is to perform 4 rounds of AES key calculation and 1 CLMUL in parallel. Crypto++ does not take advantage of it, but OpenSSL takes the opportunity. I don't know what Botan or mbedTLS do.