8086 instruction set: MODR/M byte

Question

The 8086 documentation sites seem a bit vague when the MODR/M byte is mentioned and it's really difficult to comprehend what it is and does.

What are all the bits used for in the MODR/M byte and what are the possible options?

Some documentation I've found: https://www.scs.stanford.edu/05au-cs240c/lab/i386/s17_02.htm

The ModR/M byte contains three fields of information:
The mod field, which occupies the two most significant bits of the byte, combines with the r/m field to form 32 possible values: eight registers and 24 indexing modes
The reg field, which occupies the next three bits following the mod field, specifies either a register number or three more bits of opcode information. The meaning of the reg field is determined by the first (opcode) byte of the instruction.
The r/m field, which occupies the three least significant bits of the byte, can specify a register as the location of an operand, or can form part of the addressing-mode encoding in combination with the mod field as described above

What is an indexing mode? What is a register number? How is a register represented? etc.

[Intel's own PDFs](https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html#inpage-nav-2) are pretty clear, as are detailed sites like https://wiki.osdev.org/X86-64_Instruction_Encoding#16-bit_addressing (that page covers 16-bit ModRM, so it's not *just* talking about x86-64 long mode.) Modern x86 uses the same instruction encoding (in 16-bit real mode) as 8086; that backwards compat is the whole point of x86, and why it's so nasty. And of course you can find PDF copies of the actual 8086 manual itself, and [The 8086 Primer](https://stevemorse.org/8086/) — Peter Cordes, Jun 11 '22 at 01:35
The 8086 primer from page 23 onwards covers instruction encoding of operands. It's written as a book, not just a technical manual. It's available for free on Stephen Morse's web site (https://stevemorse.org/8086/), the guy who designed it when he was at Intel. — Peter Cordes, Jun 11 '22 at 01:40
@PeterCordes All I'm asking for is a simple but thorough explanation without the cryptic nature of "https://wiki.osdev.org/X86-64_Instruction_Encoding#16-bit_addressing" and the overly complicated nature of "https://stevemorse.org/8086/". The first one is alright until it goes on to explain the REG segment of the byte, where it becomes extremely vague. The second one looks promising, but references to the reg/opcode part are rare and, again, vague; at least it actually says something about the possible options that the MOD part of the byte offers. — Mauser_Maschine, Jun 11 '22 at 02:28
I'd recommend Intel's official SDM PDFs, then. The intro chapters of vol.2 explain the x86 machine-code format. There are 8 registers, so a 3-bit field can code for one of them. That's a register number. In case Intel doesn't explain that basic computer-architecture concept. — Peter Cordes, Jun 11 '22 at 03:29

score 3 · Answer 1 · answered Jun 11 '22 at 06:46

Intel's own PDF manuals document this in detail; see vol.2 of the SDM, specifically the intro chapters before the entries for each instruction.

There are also detailed descriptions on various sites like https://wiki.osdev.org/X86-64_Instruction_Encoding#ModR.2FM_and_SIB_bytes (which covers 16-bit ModRM, so it's not just talking about x86-64 long mode.) Modern x86 uses the same instruction encoding (in 16-bit real mode) as 8086; that backwards compat is the whole point of x86, and why it's so nasty.

And of course you can find PDF copies of the actual 8086 manual itself, in case it's more helpful to read something that omits the stuff that's only relevant for other modes.

The 8086 Primer covers instruction encoding of operands from page 23 onwards. It's written as a book, not just a technical manual. It's available for free on the web site of Stephen Morse (https://stevemorse.org/8086/), who designed the 8086 when he was at Intel.

But maybe it would help to describe the basic overview of the purpose of ModRM, so you know what to look for in those docs.

ModR/M purpose and basics

Most (but not all) x86 instructions have one ModRM byte. It can encode 2 operands: either both are registers, or one is a register and the other is memory. e.g. add cx, ax, or add cx, [bx+si].

The opcode itself determines which of the r/m and r operands is the source and which is the destination, or whether the /r field acts as extra opcode bits instead. (Shifts use it that way, which is why they can't copy-and-shift, or use a count register other than CL.) add [bx+si], cx has the same ModRM byte as add cx, [bx+si] but a different opcode.
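
As a rough illustration of how the opcode picks the direction, here is a minimal Python sketch for the four ADD opcodes that take a ModRM byte (0x00 to 0x03). The function name is made up for this example; the only facts it relies on are that bit 0 of these opcodes selects byte or word size and bit 1 selects whether the /r register is the destination.

```python
def add_opcode_operands(opcode):
    """For the ADD opcodes 0x00..0x03, report operand width and direction.

    Hypothetical helper for illustration; it does not handle any other opcode.
    """
    if not 0x00 <= opcode <= 0x03:
        raise ValueError("only the ModRM forms of ADD (0x00..0x03) are handled")
    width = 16 if opcode & 0b01 else 8   # bit 0: 0 = byte operands, 1 = word operands
    if opcode & 0b10:                    # bit 1 set: the /r register is the destination
        dest, src = "reg", "r/m"
    else:                                # bit 1 clear: the r/m operand is the destination
        dest, src = "r/m", "reg"
    return width, dest, src

print(add_opcode_operands(0x01))   # opcode of add [bx+si], cx
print(add_opcode_operands(0x03))   # opcode of add cx, [bx+si]
```

Both instructions share the ModRM byte 0x08; only the opcode tells the CPU which side gets written.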

The register-only operand is encoded by the 3-bit /r field. 3 bits can code for any of x86's 8 general-purpose registers. This is a "register number": as in any normal ISA with 2^n registers, a group of n bits in the instruction encodes each register operand.
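
To make "register number" concrete, a register number is just an index into a fixed table. The sketch below uses the standard 8086 numbering; the names REG16, REG8 and reg_name are chosen for this example only.

```python
# Index = 3-bit register number. Which table applies depends on the
# operand size implied by the opcode (word vs. byte).
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]
REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]

def reg_name(number, word=True):
    """Map a 3-bit register number to a register name (illustrative helper)."""
    table = REG16 if word else REG8
    return table[number & 0b111]

print(reg_name(0b001))               # word-size operand
print(reg_name(0b001, word=False))   # byte-size operand
```

Note that the order is AX, CX, DX, BX rather than alphabetical, so number 001 is CX (or CL for byte operands).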

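The r/m operand can also be a register, but the 2-bit "mode" field determines whether the 3-bit r/m field is a register number (mod=0b11) or a memory addressing mode. The other 3 encodings of the mode field select a memory operand with no displacement (mod=00), an 8-bit displacement (mod=01), or a 16-bit displacement (mod=10). The one exception is mod=00 with r/m=110, which means a bare 16-bit address with no base register instead of [bp].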
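
Pulling the three fields out of the byte is plain shifting and masking. The following is a small sketch for 16-bit address-size only; split_modrm and disp_bytes are names invented here. It also accounts for one special case in the 16-bit encoding: mod=00 with r/m=110 means a bare 16-bit address rather than [bp], so it carries 2 displacement bytes.

```python
def split_modrm(byte):
    """Split a ModRM byte into its (mod, reg, rm) fields."""
    mod = (byte >> 6) & 0b11    # top 2 bits
    reg = (byte >> 3) & 0b111   # middle 3 bits
    rm = byte & 0b111           # low 3 bits
    return mod, reg, rm

def disp_bytes(mod, rm):
    """Displacement bytes that follow the ModRM byte (16-bit addressing)."""
    if mod == 0b11:
        return 0                        # r/m is a register, no memory operand
    if mod == 0b00:
        return 2 if rm == 0b110 else 0  # special case: direct 16-bit address
    return 1 if mod == 0b01 else 2      # mod=01: disp8, mod=10: disp16

for byte in (0xC0, 0x08, 0x4F):
    mod, reg, rm = split_modrm(byte)
    print(f"{byte:02X}: mod={mod:02b} reg={reg:03b} rm={rm:03b} "
          f"disp bytes={disp_bytes(mod, rm)}")
```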

https://wiki.osdev.org/X86-64_Instruction_Encoding#ModR.2FM_and_SIB_bytes shows the fields and interpretation for 16-bit address-size, including register numbers.

So there are only 3 bits to specify a register or combination of registers for the memory address. 386 added an escape code for a SIB byte, allowing a full selection of addressing modes like [eax + ecx*4], but an addressing mode on 8086 (and with 16-bit address-size on any CPU) must be some subset of [BX|BP] + [SI|DI] + disp0/8/16.
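
Those 3 bits index a fixed table of eight base/index combinations. Here is a sketch of that table for 16-bit address-size; RM16 and memory_operand are names made up for this example, and the caller is assumed to pass the displacement already sign-extended.

```python
# Index = 3-bit r/m field when mod != 11 (16-bit address-size).
RM16 = ["bx+si", "bx+di", "bp+si", "bp+di", "si", "di", "bp", "bx"]

def memory_operand(mod, rm, disp=0):
    """Format the memory operand selected by mod and r/m (illustrative helper)."""
    if mod == 0b11:
        raise ValueError("mod=11 selects a register, not a memory operand")
    if mod == 0b00:
        if rm == 0b110:
            return f"[0x{disp & 0xFFFF:04x}]"   # special case: direct address, no base
        return f"[{RM16[rm]}]"                  # no displacement
    return f"[{RM16[rm]}{disp:+d}]"             # mod=01: disp8, mod=10: disp16

print(memory_operand(0b00, 0b000))      # as in add cx, [bx+si]
print(memory_operand(0b00, 0b111))      # as in add cx, [bx]
print(memory_operand(0b01, 0b111, 4))   # as in add cx, [bx + 4]
```

Registers such as CX or DX never appear in the table, which is why [cx] is not encodable with 16-bit addressing.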

See "Differences between general purpose registers in 8086: [bx] works, [cx] doesn't?" and "Why don't x86 16-bit addressing modes have a scale factor, while the 32-bit version has it?"

The examples below come from assembling foo.asm and then running ndisasm -b16 foo, or from asking NASM itself to make a listing with nasm -l/dev/stdout foo.asm, and then editing to simplify the output fields.

 00 00           add [bx+si],al      ; opcode=0x00 (add byte, mem dst)  mod=00 r=000 r/m=000

 01 C0           add ax, ax          ; add r/m, r   mod=11 (register) r=000 (AX) r/m=000 (AX)
 01 08           add [bx+si], cx     ; add r/m, r   mod=00 r=001 (CX) r/m=000 ([bx+si])
 03 08           add cx, [bx+si]     ; add r, r/m   mod=00 r=001 (CX) r/m=000 ([bx+si])

 03 0F           add cx, [bx]        ; mod=00 r=001 (CX) r/m=111 ([BX])
 03 4F 04        add cx, [bx + 4]    ; mod=01 r=001 (CX) r/m=111  disp8=4

 01 F2           add dx, si          ; mod=11 r=110 (SI) r/m=010 (DX)

To create more examples, use an assembler to create machine code yourself.
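
To check your understanding against the assembler's output, you can also build the bytes by hand. This is a minimal sketch that covers only the register-to-register form of opcode 0x01 (add r/m16, r16); the function names are invented for this example, and an assembler is free to pick the equivalent 0x03 encoding instead.

```python
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]

def modrm_reg_reg(reg, rm):
    """Build a ModRM byte with mod=11, i.e. both operands are registers."""
    return (0b11 << 6) | (REG16.index(reg) << 3) | REG16.index(rm)

def encode_add_rm16_r16(dst, src):
    """Encode add dst, src using opcode 0x01 (add r/m16, r16).

    With this opcode the destination goes in r/m and the source in /r.
    """
    return bytes([0x01, modrm_reg_reg(reg=src, rm=dst)])

print(encode_add_rm16_r16("dx", "si").hex(" "))
print(encode_add_rm16_r16("ax", "ax").hex(" "))
```

The printed bytes should match the add dx, si and add ax, ax lines in the listing above.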
