Clarifying three different addressing modes in x86

Question

I am trying to reconcile the differences between the main categories of addressing memory in x86, and want to see if I have the distinctions right. If I'm understanding things correctly, there are three broad ways, each with their own syntax.

The following examples show each method used for the source (src) operand:

1. Literal/Immediate value.
- For example, to move the decimal value 10 into %eax:
```
mov $10, %eax
```
2. Direct Register addressing.
- Direct: For example, to move the value in %ebx into %eax:
```
mov %ebx, %eax
```
- Indirect: This uses the format discussed in #3.
3. Offset/Indexed/Indirect addressing:
- For example, relative to a register:
```
mov -4(%ebp), %eax
```
- For example, relative to a label or address:
```
mov string(,%edi, 4), %eax
```
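To make the two memory operands above concrete, here is a minimal Python sketch of the address each one names, assuming the usual displacement + base + index * scale calculation. The register values and the address of `string` are made up for illustration, and `effective_address` is a hypothetical helper, not part of any assembler.

```python
def effective_address(regs, disp=0, base=None, index=None, scale=1):
    """Address named by the AT&T operand disp(base, index, scale)."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    addr = disp
    if base is not None:
        addr += regs[base]            # optional base register
    if index is not None:
        addr += regs[index] * scale   # optional scaled index register
    return addr & 0xFFFFFFFF          # wrap to 32 bits, as in 32-bit mode


# Assumed values, chosen only for this example.
regs = {"ebp": 0x1000, "edi": 3}
string = 0x2000  # pretend the label `string` was placed at this address

# -4(%ebp): displacement -4, base %ebp
a1 = effective_address(regs, disp=-4, base="ebp")

# string(,%edi,4): displacement = address of string, no base, index %edi, scale 4
a2 = effective_address(regs, disp=string, index="edi", scale=4)

print(hex(a1), hex(a2))
```

With these made-up values, `-4(%ebp)` names address 0xffc and `string(,%edi,4)` names address 0x200c; the `mov` then loads the 4 bytes stored at that address into %eax.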

My main question is whether I am right that these three categories cannot be mixed or used interchangeably.

For example, we cannot use an immediate value in offset addressing, such as:

mov $2(%edi), %eax

Or, when doing direct register addressing, we cannot use an offset such as:

mov %eax(,%edi,2), %eax

Is that a correct understanding of the three main forms of memory addressing, or are there some things that I am missing here?

`-4(%ebp, 4)` is a syntax error. It has a `4` in the "index" position, not the scale. The index has to be a register or empty. Anyway no, there's only one general form of memory addressing mode, the rest are subsets of it. `(%eax, %edi, 2)` is legal, you just put the base in the wrong place. — Peter Cordes, Aug 25 '20 at 04:26
@PeterCordes if there's only one mode then, and the calculation is `address + %base + %index * scale`, wouldn't `%ebx` and `(%ebx)` produce the same value then in that calculation? Also, how would `$1` be a subset of it? — carl.hiass, Aug 25 '20 at 04:30
I said there's one general form of *memory* addressing mode. As you say, immediate and register-direct are distinct, using either a different opcode (immediate) or a different "mode" encoding in modrm to take the register value, rather than memory at that address. — Peter Cordes, Aug 25 '20 at 04:32
@PeterCordes oh, got it! Sorry I misunderstood that sentence then. — carl.hiass, Aug 25 '20 at 04:34
It wasn't the clearest reply to your question, that's on me. Now that I see what you're asking, you simply got the syntax wrong for `2(%edi)` - it's a displacement in the addressing mode, not an immediate. But yes you can use a numeric literal instead of a symbol. Also, did you mean for `%eax(,%edi,2)` to do something other than reference memory at `[eax + edi*2]`? Like add memory contents to a register value too, like `add 0(,%edi,2), %eax`? If that's just another syntax mistake with the base in the displacement position then there was no misunderstanding, just syntax. — Peter Cordes, Aug 25 '20 at 04:34
@PeterCordes right the last two examples should have incorrect syntax - I was just trying to give a made-up example for someone with more knowledge to say something along the lines of "that's correct you cannot use an immediate with an offset" or something like that, if that makes sense. — carl.hiass, Aug 25 '20 at 04:37
By the way thanks for the duplicate links. These are both great help: https://stackoverflow.com/questions/34058101/referencing-the-contents-of-a-memory-location-x86-addressing-modes, and from that, http://www.sig9.com/articles/att-syntax, — carl.hiass, Aug 25 '20 at 04:38
Well ok, but you *can* use a numeric literal there, it's just called a displacement instead of an immediate, and you don't use a `$`. In AT&T syntax, the `$` applies to the whole argument, not just the number. e.g. `add $2 + 3, %eax` is the same as `add $5, %eax`, you don't need to and *can't* write it as `add $2 + $3, %eax`. So `$2(%eax)` is obviously invalid because the whole expression needs to be an assemble-time constant. (Or a symbol address, like `$symbol`, which only becomes a known constant at link time.) — Peter Cordes, Aug 25 '20 at 04:39
Also related: [Do terms like direct/indirect addressing mode actual exists in the Intel x86 manuals](https://stackoverflow.com/q/46257018) - yes the only real distinctions are immediate vs. register vs. memory with some addressing mode. Some people like to put names on `(%eax)` like register-indirect, but other than x86-64 RIP-relative, all memory forms are just subsets of the general case. Unlike on some other ISAs where there's more difference, like different instructions that allow different modes. — Peter Cordes, Aug 25 '20 at 04:54
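As a rough illustration of the distinction drawn in the comments above (immediate vs. register vs. memory operand), the following Python sketch classifies a few AT&T operand strings. It is a simplified, hypothetical classifier, not how GAS actually parses: it assumes displacements are a single number or symbol (no expressions such as `2 + 3`) and does not handle segment overrides or `%st(0)`.

```python
import re

REG = r"%[a-z0-9]+"

# disp(base, index, scale): each part is optional, the index must be a
# register, and the displacement carries neither `$` nor `%`.
MEM = re.compile(
    rf"^(?P<disp>[-+]?\w+)?"
    rf"(?:\(\s*(?P<base>{REG})?\s*"
    rf"(?:,\s*(?P<index>{REG})\s*(?:,\s*(?P<scale>[1248]))?)?\s*\))?$"
)


def classify(operand):
    """Return 'immediate', 'register', 'memory', or 'invalid'."""
    op = operand.strip()
    if op.startswith("$"):
        # `$` applies to the whole operand, so no (...) part may follow it.
        return "immediate" if re.fullmatch(r"\$[-+]?\w+", op) else "invalid"
    if re.fullmatch(REG, op):
        return "register"
    if op and MEM.match(op):
        return "memory"
    return "invalid"


for text in ["$10", "%ebx", "-4(%ebp)", "string(,%edi, 4)",
             "(%eax,%edi,2)", "$2(%edi)", "%eax(,%edi,2)", "-4(%ebp, 4)"]:
    print(f"{text:18} {classify(text)}")
```

In this sketch `$2(%edi)` and `%eax(,%edi,2)` come out as invalid, while `2(%edi)` and `(%eax,%edi,2)` would be ordinary memory operands, which matches the point that the mistakes in the question are syntax errors and not forbidden combinations of modes.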
