From Agner Fog's C++ optimization manual, I read:
The code for accessing a data member is more compact if the offset of the member relative to the beginning of the structure or class is less than 128 because the offset can be expressed as an 8-bit signed number. If the offset relative to the beginning of the structure or class is 128 bytes or more then the offset has to be expressed as a 32-bit number (the instruction set has nothing between 8 bit and 32 bit offsets). Example:
// Example 7.40
class S2 {
public:
int a[100]; // 400 bytes. first byte at 0, last byte at 399
int b; // 4 bytes. first byte at 400, last byte at 403
int ReadB() {return b;}
};
The offset of b is 400 here. Any code that accesses b through a pointer or a member function such as ReadB needs to code the offset as a 32-bit number. If a and b are swapped then both can be accessed with an offset that is coded as an 8-bit signed number, or no offset at all. This makes the code more compact so that the code cache is used more efficiently. It is therefore recommended that big arrays and other big objects come last in a structure or class declaration and the most often used data members come first. If it is not possible to contain all data members within the first 128 bytes then put the most often used members in the first 128 bytes.
I have tried this and I see no difference in the assembly output of this test program, as shown here:
class S2 {
public:
int a[100]; // 400 bytes. first byte at 0, last byte at 399
int b; // 4 bytes. first byte at 400, last byte at 403
int ReadB() { return b; }
};
// Changed order of variables a and b!
class S3 {
public:
int b; // 4 bytes. first byte at 0, last byte at 3
int a[100]; // 400 bytes. first byte at 4, last byte at 403
int ReadB() { return b; }
};
int main()
{
S3 s3; s3.b = 32;
S2 s2; s2.b = 16;
}
The output is
push rbp
mov rbp, rsp
sub rsp, 712
mov DWORD PTR [rbp-416], 32
mov DWORD PTR [rbp-432], 16
mov eax, 0
leave
ret
Clearly, mov DWORD PTR is used for both cases.
- Can someone explain why this is?
- Can someone explain what is meant by "the instruction set has nothing between 8 bit and 32 bit offsets" (I'm new to ASM) and what this statement suggests that I should be seeing in the ASM?