Memory addressing mode interpretation for x86 on Linux

Question

I am reading through Programming from the Ground Up by Jonathan Bartlett. The author discusses memory addressing modes and states that the general form of a memory address reference is this:

ADDRESS_OR_OFFSET (%BASE_OR_OFFSET, %INDEX, MULTIPLIER)

where the final address is calculated thus:

FINAL_ADDRESS = ADDRESS_OR_OFFSET + %BASE_OR_OFFSET + MULTIPLIER * %INDEX.

It is also stated that if any of the pieces is left out, it is just substituted with zero in the equation. ADDRESS_OR_OFFSET and MULTIPLIER are required to be constants, while the other elements are required to be registers. These seem to be the only general rules specified.

So far, so good.

The author then discusses indirect addressing mode and gives as example:

movl (%eax), %ebx

which moves the value at the address stored in the eax register into the ebx register.

For that to work, (%eax) should be interpreted as 0(%eax,0,0) as opposed to 0(0,%eax,0). Is there an additional rule that enforces this interpretation?

@MichaelPetch Hmm..so, the rule that "if any of the pieces is left out, it is just substituted with 0 in the equation" is not quite true then. — Tryer, Jan 06 '19 at 11:21
I think your book is seeking to establish rules that aren't exactly what's going on. I'm pretty sure that your MOV would translate to [%eax(0,0,0)]. The Intel instruction reference would be a better place to look for rules of this sort. — David Hoelzer, Jan 06 '19 at 11:22
Related: [Referencing the contents of a memory location. (x86 addressing modes)](https://stackoverflow.com/q/34058101) lists all the variations on addressing modes that x86 supports. — Peter Cordes, Jan 06 '19 at 14:27

fuz · Accepted Answer · 2019-01-06T14:32:16.087

The explanation in the book is not 100% correct. The x86 architecture has the following 32 bit addressing modes:

$imm                         immediate     result = imm
%reg                         register      result = reg
disp(%reg)                   indirect      result = MEM[disp + reg]
disp                         direct        result = MEM[disp]
disp(%base, %index, scale)   SIB           result = MEM[disp + base + index * scale]

In the SIB (scale/index/base) and indirect addressing modes, disp can be left out for a 0 byte displacement. In SIB addressing mode, base and index can additionally be left out, in which case they contribute nothing to the address (0 base, 0 index respectively). The scale is a constant (1, 2, 4, or 8), not a register, and can also be left out, in which case it defaults to 1. Note that when I say “leave out,” only the value is left out; the commas separating it from any later component stay in. For example, (,,1) means “SIB operand with no displacement, no base, no index, and 1 scale.”
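
To make the omission rules concrete, here is a minimal Python sketch of the address arithmetic. It is only a model, not assembler output: the function name `effective_address` and the register values are made up for illustration, and `None` stands for a component that was left out.

```python
def effective_address(disp=0, base=None, index=None, scale=1):
    """Model disp(%base, %index, scale): disp + base + index * scale.

    None stands for an omitted register; an omitted component simply
    contributes nothing to the sum. An omitted scale defaults to 1.
    """
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4 or 8")
    address = disp
    if base is not None:
        address += base
    if index is not None:
        address += index * scale
    # With 32-bit addressing the result wraps around at 32 bits.
    return address & 0xFFFFFFFF


# Hypothetical register contents, chosen only for this example.
eax, ecx = 0x1000, 3

print(hex(effective_address(base=eax)))                          # (%eax)
print(hex(effective_address(disp=8, base=eax)))                  # 8(%eax)
print(hex(effective_address(base=eax, index=ecx, scale=4)))      # (%eax,%ecx,4)
print(hex(effective_address(disp=0x2000, index=ecx, scale=4)))   # 0x2000(,%ecx,4)
```

With these values the four operands resolve to 0x1000, 0x1008, 0x100c and 0x200c.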

In 64 bit mode, a rip-relative addressing mode is additionally available:

disp(%rip)                   rip relative  result = MEM[disp + rip]

This addressing mode is useful for writing position-independent code.
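
As a rough illustration of why this helps with position independence, the Python sketch below models a rip-relative access. During address calculation, rip holds the address of the next instruction, so the displacement only encodes the distance between the code and the data. The offsets, load addresses, and the name `rip_relative_target` are all hypothetical.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def rip_relative_target(next_instruction_address, disp):
    """Model disp(%rip): rip is the address of the NEXT instruction."""
    return (next_instruction_address + disp) & MASK64


# Hypothetical layout inside one loaded image (offsets from the load base).
INSTRUCTION_END_OFFSET = 0x1017   # offset just past the accessing instruction
DATA_OFFSET = 0x4020              # offset of the variable it accesses

# The displacement is fixed once and depends only on the distance.
disp = DATA_OFFSET - INSTRUCTION_END_OFFSET

# The same displacement finds the variable wherever the image is loaded.
for load_base in (0x400000, 0x555555554000, 0x7F0000000000):
    target = rip_relative_target(load_base + INSTRUCTION_END_OFFSET, disp)
    assert target == load_base + DATA_OFFSET
    print(hex(load_base), "->", hex(target))
```

An absolute `disp` operand, by contrast, would have to be patched whenever the load address changes.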

16 bit modes have different addressing modes, but they don't really matter so I'm not going to elaborate on them.

So for your example: (%eax) is easy to understand because it's actually an indirect addressing mode, not a SIB addressing mode, with eax as the register and no displacement.
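
Either way, which role a register plays is decided by its position inside the parentheses, not by filling the other slots with zeros. The following is a simplified Python sketch of that positional reading. The parser `parse_memory_operand` is a hypothetical helper, it does not validate register names, and it is not how the GNU assembler is actually implemented.

```python
def parse_memory_operand(text):
    """Split an AT&T memory operand into (disp, base, index, scale).

    Simplified model: disp is kept as a string because it may be a number
    or a symbol; an omitted register is None; an omitted scale is 1.
    """
    text = text.replace(" ", "")
    if "(" not in text:
        return (text, None, None, 1)      # direct: displacement only
    disp, _, inner = text.partition("(")
    if not inner.endswith(")"):
        raise ValueError("missing closing parenthesis")
    slots = inner[:-1].split(",")
    if len(slots) > 3:
        raise ValueError("at most base, index and scale are allowed")
    slots += [""] * (3 - len(slots))      # pad omitted trailing slots
    base, index, scale = slots
    return (disp or "0", base or None, index or None,
            int(scale) if scale else 1)


print(parse_memory_operand("(%eax)"))          # eax lands in the base slot
print(parse_memory_operand("(,%eax,1)"))       # eax lands in the index slot
print(parse_memory_operand("8(%ebx,%ecx,4)"))
print(parse_memory_operand("table(,%edi,4)"))  # "table" is a made-up symbol
```

In this model, `(%eax)` has no commas, so `%eax` can only be the base. To use it as an index you have to write the leading comma yourself, as in `(,%eax,1)`.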

score 0 · Answer 2 · answered Jun 02 '22 at 18:19

I'm also reading this book and noticed that the code examples are slightly different from others you may find on the internet. This is because:

The syntax for assembly language used in this book is known as the AT&T syntax. It is the one supported by the GNU tool chain that comes standard with every Linux distribution. However, the official syntax for x86 assembly language (known as the Intel® syntax) is different.

About the question, I have found more information here:

The base, index and displacement components can be used in any combination, and every component can be omitted; omitted components are excluded from the calculation above. If index register is missing, the pointless scale factor must be omitted as well.
