Understanding the SIB byte in x86 Assembly

Question

I have read every article Google has shown about the SIB byte, as well as this video and the Intel manuals, but I'm still a bit unclear. This page is particularly helpful, with this content:

[ reg32 + eax*n ] MOD = 00
[ reg32 + ebx*n ] 
[ reg32 + ecx*n ]
[ reg32 + edx*n ]
[ reg32 + ebp*n ]
[ reg32 + esi*n ]
[ reg32 + edi*n ]

[ disp + reg8 + eax*n ] MOD = 01
[ disp + reg8 + ebx*n ]
[ disp + reg8 + ecx*n ]
[ disp + reg8 + edx*n ]
[ disp + reg8 + ebp*n ]
[ disp + reg8 + esi*n ]
[ disp + reg8 + edi*n ]

[ disp + reg32 + eax*n ] MOD = 10
[ disp + reg32 + ebx*n ]
[ disp + reg32 + ecx*n ]
[ disp + reg32 + edx*n ]
[ disp + reg32 + ebp*n ]
[ disp + reg32 + esi*n ]
[ disp + reg32 + edi*n ]

[ disp + eax*n ] MOD = 00, and
[ disp + ebx*n ] BASE field = 101
[ disp + ecx*n ]
[ disp + edx*n ]
[ disp + ebp*n ]
[ disp + esi*n ]
[ disp + edi*n ]

where n = 1, 2, 4 or 8.

From my understanding (though none of the docs seem to show this explicitly), the reg is the "base", the eax etc. are the index, and the n is the "scale" factor, as if you were doing this in an array in JavaScript:

base[index * scale]

The scale essentially allows you to jump by a multiple, so if your memory is byte-addressed but your objects are 32 bits wide, you can step 4 bytes at a time by using a scale factor of 4. That type of thing.

Am I on the right track? I can't tell fully yet.

What I'm confused about is the displacement. How does the displacement play into the equation here? If you were to draw it as a JavaScript array... And in the case of the last batch of examples, like [ disp + eax*n ], there is no reg (what I'm considering the "base"). Does this mean the base is 0? Also, can the disp be another register, or can it only be a static hardcoded value? Finally, what are all the registers that can be used as the base and the index?

Also, what does `disp + reg8 + eax*n` mean? Are we literally doing addition here?

It's just a double indexed addressing mode. You can use two arbitrary registers, one of which may be scaled, and a constant displacement. The three components are then added. Yes, it's a literal addition. Any of the three components (base, index, displacement) may be omitted. — fuz, Jan 30 '21 at 14:00
"double indexed addressing mode", "You can use two arbitrary registers" (which ones exactly, which sizes). Does it matter what one is encoded in the base vs. the index? How big can the displacement be? Does the total after addition need to be under a certain size? — Lance, Jan 30 '21 at 14:12
An indexed addressing mode is any addressing mode that combines direct addressing (the address is given in an immediate displacement) and indirect addressing (the address is given in a register) to form the address from the sum of a register and a displacement. This is commonly used to index arrays, hence “indexed addressing mode.” “two arbitrary registers” → depending on the address size, these are all 32-bit or all 64-bit registers. You can toggle the address size with a `67` prefix. The displacement is absent, 1 byte, or 4 bytes. Base vs. index doesn't matter, but only the index can be scaled. — fuz, Jan 30 '21 at 14:16
“Does the total after addition need to be under a certain size?” → no, the addition is performed at the full 32 or 64 bits, depending on the address size. Note that all these explanations silently ignore 16-bit addressing modes, since those have no SIB byte. — fuz, Jan 30 '21 at 14:21
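To make the "literal addition" in the comments above concrete, here is a minimal Python sketch of how the three components combine. The function name `effective_address` and its parameters are made up for illustration. The registers are modelled as plain integers, and the wrap-around at the address size is this sketch's reading of "performed at the full 32 or 64 bits".

```python
def effective_address(base=0, index=0, scale=1, disp=0, address_bits=32):
    """Return base + index*scale + disp, wrapped to the address size.

    Hypothetical helper: base and index stand for register *values*,
    disp for the constant encoded in the instruction. An omitted
    component simply contributes 0 to the sum.
    """
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4 or 8")
    if address_bits not in (32, 64):
        raise ValueError("only 32- and 64-bit address sizes use a SIB byte")
    mask = (1 << address_bits) - 1
    return (base + index * scale + disp) & mask


# Element 3 of an array of 4-byte items starting at 0x1000: [base + index*4]
addr_elem = effective_address(base=0x1000, index=3, scale=4)

# Same element, skipping an 8-byte header via the displacement: [8 + base + index*4]
addr_field = effective_address(base=0x1000, index=3, scale=4, disp=8)

# No base register at all, as in the last group of the table: [disp + index*4]
addr_nobase = effective_address(index=3, scale=4, disp=0x2000)

# The sum wraps at the address size rather than being an error
addr_wrapped = effective_address(base=0xFFFFFFFF, index=1, scale=1)

print(hex(addr_elem), hex(addr_field), hex(addr_nobase), hex(addr_wrapped))
```

In the JavaScript analogy from the question, this is roughly `memory[base + index * scale + disp]`, where `memory` is one flat byte array. The displacement is always a constant baked into the instruction, never a register.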
Have you read [Referencing the contents of a memory location. (x86 addressing modes)](https://stackoverflow.com/q/34058101) which starts out by explaining the basic form of x86 addressing modes, and names the parts? And then shows examples of how they can be used. Also [rbp not allowed as SIB base?](https://stackoverflow.com/q/52522544) explains the "escape codes" that mean special things, like the code that would mean ESP as index actually means no index, so it's possible to encode `[esp]` instead of only `[esp + esp]`. — Peter Cordes, Jan 30 '21 at 17:00
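As a rough illustration of the escape codes mentioned above, the following Python sketch splits a SIB byte into its scale, index and base fields for 32-bit addressing (no REX prefix). The names `decode_sib` and `describe` are invented for this example. Real decoding also depends on the rest of the ModRM byte, which is reduced here to just its `mod` field. Note that `mod = 01` selects an 8-bit displacement, not an 8-bit register, which is probably what the `reg8` rows of the quoted table are getting at.

```python
# 3-bit register numbers as used in the ModRM and SIB bytes
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_sib(sib, mod):
    """Return (base, index, scale, disp_bytes) for a SIB byte.

    mod is the 2-bit mod field of the preceding ModRM byte
    (0b11 means register-direct, which has no SIB byte at all).
    """
    if not 0 <= sib <= 0xFF:
        raise ValueError("sib must be a single byte")
    if mod not in (0b00, 0b01, 0b10):
        raise ValueError("mod must be 00, 01 or 10 when a SIB byte is present")

    scale = 1 << (sib >> 6)            # bits 7-6: 1, 2, 4 or 8
    index_code = (sib >> 3) & 0b111    # bits 5-3
    base_code = sib & 0b111            # bits 2-0

    # Escape code: index field 100 (would be esp) means "no index"
    index = None if index_code == 0b100 else REG32[index_code]
    # Escape code: base field 101 with mod 00 means "no base, disp32 follows"
    base = None if (base_code == 0b101 and mod == 0b00) else REG32[base_code]

    if mod == 0b01:
        disp_bytes = 1
    elif mod == 0b10 or base is None:
        disp_bytes = 4
    else:
        disp_bytes = 0
    return base, index, scale, disp_bytes


def describe(sib, mod):
    """Format the decoded fields as an addressing expression."""
    base, index, scale, disp_bytes = decode_sib(sib, mod)
    parts = []
    if base:
        parts.append(base)
    if index:
        parts.append(f"{index}*{scale}")
    if disp_bytes:
        parts.append(f"disp{disp_bytes * 8}")
    return "[" + " + ".join(parts) + "]"


print(describe(0x88, 0b00))  # scale=4, index=ecx, base=eax
print(describe(0x24, 0b00))  # index field 100: no index, base=esp
print(describe(0x8D, 0b00))  # base field 101 with mod 00: no base register
print(describe(0x8D, 0b01))  # same SIB byte with mod 01: ebp base plus disp8
```

The last two calls show why the table in the question lists `[ disp + eax*n ]` under "MOD = 00, and BASE field = 101". The same SIB byte means "no base, 32-bit displacement" with `mod = 00`, but "ebp as base" with any other `mod`.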
