AVX2 (Advanced Vector Extensions 2) is an instruction set extension for x86. It adds 256-bit versions of integer instructions (AVX provided 256-bit operations only for floating point).
AVX2 adds support for 256-bit integer SIMD: most existing 128-bit SSE integer instructions are extended to 256 bits. AVX2 instructions use the same VEX encoding scheme as AVX instructions.
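As a minimal sketch of what a 256-bit integer operation looks like from C, the following adds eight 32-bit integers with a single `_mm256_add_epi32`. The function name `add8_epi32` is hypothetical, and the `target("avx2")` attribute assumes GCC or Clang; with other compilers you would enable AVX2 through a compiler option instead.

```c
#include <immintrin.h>
#include <stdint.h>

/* Hypothetical helper: out[i] = a[i] + b[i] for i = 0..7.
   The target attribute (GCC/Clang) enables AVX2 for this function only,
   so the rest of the file can be built without -mavx2. */
__attribute__((target("avx2")))
void add8_epi32(const int32_t *a, const int32_t *b, int32_t *out)
{
    /* Unaligned loads: the pointers need no special alignment. */
    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb = _mm256_loadu_si256((const __m256i *)b);

    /* One instruction (vpaddd) adds all eight 32-bit elements. */
    _mm256_storeu_si256((__m256i *)out, _mm256_add_epi32(va, vb));
}
```

Callers should only reach this function on CPUs that support AVX2, for example after checking `__builtin_cpu_supports("avx2")` on GCC or Clang.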
See the x86 tag page for guides and other resources for programming and optimising programs using AVX2.
As with AVX, common problems are a missing VZEROUPPER before running legacy-SSE code (which can cause transition penalties), and non-obvious data movement in shuffles, due to the 128-bit lane design.
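The lane behaviour is easiest to see with an unpack. The sketch below (function names are hypothetical, and the `target` attribute assumes GCC or Clang) shows that `_mm256_unpacklo_epi32` interleaves the low half of each 128-bit lane separately, and one possible way to get a "full-width" interleave by first rearranging 64-bit chunks with the lane-crossing `_mm256_permute4x64_epi64`.

```c
#include <immintrin.h>

/* In-lane unpack. With a = {a0..a7} and b = {b0..b7} the result is
   {a0,b0,a1,b1, a4,b4,a5,b5}: each 128-bit lane is processed on its own. */
__attribute__((target("avx2")))
void unpacklo_inlane(const int *a, const int *b, int *out)
{
    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb = _mm256_loadu_si256((const __m256i *)b);
    _mm256_storeu_si256((__m256i *)out, _mm256_unpacklo_epi32(va, vb));
}

/* Full-width interleave of the low four elements of a and b:
   {a0,b0,a1,b1, a2,b2,a3,b3}. */
__attribute__((target("avx2")))
void unpacklo_full(const int *a, const int *b, int *out)
{
    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb = _mm256_loadu_si256((const __m256i *)b);

    /* 0xD8 selects 64-bit chunks in the order 0,2,1,3, which moves
       elements 2,3 into the low half of the upper lane. */
    va = _mm256_permute4x64_epi64(va, 0xD8);
    vb = _mm256_permute4x64_epi64(vb, 0xD8);

    _mm256_storeu_si256((__m256i *)out, _mm256_unpacklo_epi32(va, vb));
}
```

The extra permutes are not free, so it is often worth arranging the surrounding code so that the in-lane result is acceptable.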
AVX2 also adds the following new functionality:
- Broadcast from a scalar in a vector register to every element of a vector
- Gather loads, which load the elements of a vector from different memory locations
- Masked memory loads/stores
- New permute instructions
- Element-wise bit-shifting that allows each element of a vector to be shifted by a different amount.
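The features listed above map onto intrinsics roughly as in the sketch below. The function names are hypothetical and the `target` attribute assumes GCC or Clang; only the `_mm256_*` and `_mm_*` intrinsics are the real interface.

```c
#include <immintrin.h>

/* Gather: out[i] = table[idx[i]] for eight 32-bit indices. */
__attribute__((target("avx2")))
void gather8(const int *table, const int *idx, int *out)
{
    __m256i vidx = _mm256_loadu_si256((const __m256i *)idx);
    /* The last argument is the byte scale applied to each index. */
    __m256i v = _mm256_i32gather_epi32(table, vidx, 4);
    _mm256_storeu_si256((__m256i *)out, v);
}

/* Register broadcast plus per-element shift: out[i] = x << counts[i]. */
__attribute__((target("avx2")))
void shift_each(int x, const int *counts, int *out)
{
    __m256i vx = _mm256_broadcastd_epi32(_mm_cvtsi32_si128(x));
    __m256i vc = _mm256_loadu_si256((const __m256i *)counts);
    _mm256_storeu_si256((__m256i *)out, _mm256_sllv_epi32(vx, vc));
}

/* Lane-crossing permute: out[i] = in[order[i] & 7]. */
__attribute__((target("avx2")))
void permute8(const int *in, const int *order, int *out)
{
    __m256i v = _mm256_loadu_si256((const __m256i *)in);
    __m256i vo = _mm256_loadu_si256((const __m256i *)order);
    _mm256_storeu_si256((__m256i *)out, _mm256_permutevar8x32_epi32(v, vo));
}

/* Masked load/store: copy the first n (0..8) elements of src to dst,
   leaving the remaining elements of dst untouched. */
__attribute__((target("avx2")))
void copy_first_n(const int *src, int *dst, int n)
{
    __m256i lane = _mm256_setr_epi32(0, 1, 2, 3, 4, 5, 6, 7);
    /* Element i of the mask is all-ones where i < n, zero elsewhere. */
    __m256i mask = _mm256_cmpgt_epi32(_mm256_set1_epi32(n), lane);
    __m256i v = _mm256_maskload_epi32(src, mask);
    _mm256_maskstore_epi32(dst, mask, v);
}
```

Note that `_mm256_permutevar8x32_epi32` is one of the few shuffles that moves data across the 128-bit lane boundary, which is why it is useful for fixing up the results of in-lane instructions.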
The AVX2 instruction set was introduced together with FMA3 (3-operand fused multiply-add) in 2013 with Intel's Haswell processor line. (AMD CPUs from Piledriver onwards support FMA3, but AMD did not add AVX2 support until the later Excavator microarchitecture.)