The multiple of 4 bytes is only for esp? or is it related to every register?
Note that sub esp, N doesn't access any memory location. Its use is related to memory alignment, but the instruction itself is a simple register-immediate subtraction and would work with any value of N.
For performance reasons, a 16-bit read should be at an address that is a multiple of 2, and a 32-bit read at an address that is a multiple of 4.
This is called natural boundary alignment.
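To make the definition concrete, here is a minimal C sketch of the natural-alignment test. The helper name is_naturally_aligned is made up for this illustration, and it assumes the access size is a power of two (1, 2, 4, 8).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true when an access of `size` bytes at `addr`
   sits on its natural boundary, i.e. addr is a multiple of size.
   Assumption: size is a non-zero power of two (1, 2, 4, 8, ...). */
static bool is_naturally_aligned(uintptr_t addr, size_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    /* For a power of two, "multiple of size" means the low bits are zero. */
    return (addr & (size - 1)) == 0;
}
```

With this sketch, is_naturally_aligned(0x1004, 4) is true, is_naturally_aligned(0x1002, 4) is false, and is_naturally_aligned(0x1002, 2) is true.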
32-bit systems can only push/pop 16- or 32-bit values. If we only use multiples of 4 in instructions like sub esp, N, then ESP stays a multiple of 4 and push/pop access data aligned on their natural boundaries (note that 4 is a multiple of 2).
Data on the stack is also accessed directly with instructions like
mov [ebp-04h], eax
The principle here is the same: EBP is a multiple of 4 (its value is the old ESP value, from before the subtraction), so the 32-bit data is stored at an address that is a multiple of 4 (naturally aligned).
The natural alignment of bytes is... 1, meaning that they should be at an address that is a multiple of 1, i.e. anywhere.
That's why mov byte [ebp-01h], 'A' performs as well as mov byte [ebp-04h], 'A'.
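The address arithmetic behind these EBP-relative accesses can be sketched in C. This is only an illustration: the helper names round_up_to_4 and slot_address and the sample frame pointer value are hypothetical, and the sketch assumes a 32-bit frame pointer that is already a multiple of 4.

```c
#include <assert.h>
#include <stdint.h>

/* Round a local-variable area up to the next multiple of 4,
   the kind of N one would use in `sub esp, N`. */
static uint32_t round_up_to_4(uint32_t n)
{
    return (n + 3u) & ~3u;
}

/* Address of the slot [ebp - offset]. Pure arithmetic: like
   `sub esp, N` itself, this touches no memory. */
static uint32_t slot_address(uint32_t ebp, uint32_t offset)
{
    assert(ebp % 4 == 0); /* assumption: frame pointer is 4-byte aligned */
    return ebp - offset;
}
```

With a made-up EBP of 0x0012FF70, slot_address(ebp, 4) gives 0x0012FF6C, a multiple of 4 and therefore fine for a dword. slot_address(ebp, 1) gives 0x0012FF6F, which is not a multiple of 4 but is perfectly aligned for a byte. round_up_to_4(5) yields 8, so a 5-byte local area would still leave ESP on a multiple of 4.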
Trivia
As a rule of thumb, IA-32e general-purpose instructions can read and write anything from bytes to qwords at any address.
The whole alignment story is mostly about performance, unlike on many RISC machines, which structurally cannot access unaligned data.
When initially introduced, SSE instructions came in fast "aligned" (like movaps) and slow "unaligned" (like movups) versions of the same instruction. The aligned forms fault if their memory operand is not on a 16-byte boundary.
64-bit systems now explicitly require 128-bit alignment of the stack to perform better with vector instructions (and widened registers).
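As a rough illustration of what a 128-bit (16-byte) boundary means for data, here is a small C11 sketch. The names is_aligned_16, require_aligned_16 and vec4 are invented for this example. It only checks addresses and does not issue any SSE instruction itself.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdbool.h>
#include <stdint.h>

/* True when p is on a 16-byte (128-bit) boundary, which is what the
   aligned SSE forms such as movaps expect from a memory operand. */
static bool is_aligned_16(const void *p)
{
    return ((uintptr_t)p % 16) == 0;
}

/* Hypothetical guard a caller could run before handing p to code
   that relies on aligned vector loads. */
static void require_aligned_16(const void *p)
{
    assert(is_aligned_16(p));
}

/* Hypothetical buffer: alignas(16) asks the compiler to place it on a
   16-byte boundary, large enough for one 128-bit vector of floats. */
static alignas(16) float vec4[4];
```

Here is_aligned_16(vec4) holds, while the address 4 bytes into the buffer, (const char *)vec4 + 4, is only 4-byte aligned and would need the unaligned form.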
The CPU has a bit in the EFLAGS register, the AC bit, that lets a program enable or disable a strict alignment policy (à la RISC), provided the OS has enabled this feature (by setting AM in CR0).
Aligning data more strictly than the CPU data bus (for whatever definition of it applies to modern integrated DRAM controllers) is pointless.
That's why new ABIs align on 128 bits even though the CPU can have 512-bit registers.
Alignment requirements for every instruction can be found in Manual 2 (the complete set).