
I need to know whether there is any restriction on loading the code segment register directly with a mov instruction.

This occurred to me while reading about the switch from real mode to protected mode. I found that a "jump" instruction is used to put the correct value in the code segment register.

So is the jump instruction used because of such a restriction? Why can't we load a value into the code segment register directly?

Peter Cordes
  • You can't modify _CS_ directly with a MOV instruction, but you can set it with a FAR JMP or FAR CALL, where you specify the segment to load into CS and the label to jump to. There are some other, more convoluted ways I won't mention. The syntax of a FAR JMP depends on which assembler you use (you never mentioned it) – Michael Petch Sep 25 '18 at 04:21
  • If you look at an instruction set reference for [MOV](http://www.felixcloutier.com/x86/MOV.html), it has this statement: _The MOV instruction cannot be used to load the CS register. Attempting to do so results in an invalid opcode exception (#UD). To load the CS register, use the far JMP, CALL, or RET instruction._ – Michael Petch Sep 25 '18 at 04:25

1 Answer


Setting CS would be a jump, because code-fetch happens from CS:IP (or CS:RIP/EIP).

It makes sense that doing this is restricted to jmp far / call far / ret far and other control-transfer instructions.

Changing CS without changing IP would be weird: the next instruction to execute after a hypothetical mov cs, ax instruction would be new_CS_base:old_IP+2 (because mov cs,ax is 2 bytes long if you don't use an operand-size prefix.)

Sure, you could set things up so you had code at the same IP offset relative to two different segment bases, but having pop cs be a jump while pop ds isn't would just be weird. Forcing you to set both CS and IP at the same time with a jmp seems pretty sane/normal to me.
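To make the new_CS_base:old_IP+2 arithmetic concrete, here is a minimal Python sketch of real-mode address calculation. It assumes 8086-style addressing (base = segment << 4, wrapped to 20 bits). The function names and register values are made up for illustration, and `mov cs, ax` is of course hypothetical.

```python
def linear_address(segment, offset):
    """Real-mode linear address: (segment << 4) + offset, wrapped to 20 bits (8086-style)."""
    return (((segment & 0xFFFF) << 4) + (offset & 0xFFFF)) & 0xFFFFF


def next_fetch_after_mov_cs(new_cs, old_ip, insn_len=2):
    """Where code fetch would continue after a hypothetical 2-byte `mov cs, ax`."""
    new_ip = (old_ip + insn_len) & 0xFFFF  # IP only advances past the instruction
    return linear_address(new_cs, new_ip)


# Illustrative values: running at 0x1000:0x0100, then CS suddenly becomes 0x2000.
old_cs, old_ip, new_cs = 0x1000, 0x0100, 0x2000
print(hex(linear_address(old_cs, old_ip + 2)))       # where execution would have continued
print(hex(next_fetch_after_mov_cs(new_cs, old_ip)))  # where it continues instead
```

The two printed addresses differ by exactly the difference between the segment bases, which is why a bare write to CS behaves like a jump to wherever that offset happens to land in the new segment.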

Related: Is it possible to manipulate the instruction pointer in 8086 assembly? and What is the purpose of CS and IP registers in Intel 8086 assembly?


Remember that protected mode was a later extension; in real mode the CS value is used directly to form the segment base (base = cs<<4). The use-case of loading a new descriptor with the same base only appeared with protected mode (386, or actually already 286). Before that there wasn't really a use-case for mov cs, r/m16 or pop cs opcodes, so Intel reserved those instruction encodings for other uses.

That simplified future CPUs by not having to support mov cs, r/m or pop cs as jump instructions that would have to discard prefetched code.
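The contrast between real-mode and protected-mode segment bases can be sketched in Python as follows. This is a toy model under stated assumptions: `toy_gdt` is a made-up two-entry table, its `bits` field loosely stands in for a descriptor's default operand size, and real descriptors carry much more (limit, access rights, and so on).

```python
def real_mode_base(cs):
    """Real mode: the base is derived directly from the CS value."""
    return (cs & 0xFFFF) << 4


# Hypothetical descriptor table: index -> descriptor fields (illustration only).
toy_gdt = {
    1: {"base": 0x00000, "bits": 16},  # 16-bit code segment
    2: {"base": 0x00000, "bits": 32},  # 32-bit code segment with the same base
}


def protected_mode_base(selector, table):
    """Protected mode: the selector indexes a table, and the base comes from the descriptor."""
    index = (selector & 0xFFFF) >> 3  # the low 3 bits are RPL and the table indicator
    return table[index]["base"]


# Real mode: two different CS values always give two different bases.
print(real_mode_base(0x1000) == real_mode_base(0x1001))
# Protected mode: two different selectors can share one base.
print(protected_mode_base(0x08, toy_gdt) == protected_mode_base(0x10, toy_gdt))
```

Only in the second case could a new CS value leave code fetch pointing at the same bytes, which is the use-case that did not exist before protected mode.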

(In some early versions of 8086, pop cs did exist, following the same pattern as push/pop of other segment regs, and it had opcode 0x0f, but Intel wisely decided to reserve 0F for use as an escape byte for multi-byte opcodes in future x86 CPUs. See also: What would happen if the CS segment register is changed? (And how would you do so?))


Changing CS in protected mode is even less common than in real mode (mainstream OSes use a flat memory model), so there was definitely no need to start supporting mov to CS. jmp far works perfectly well, and better in fact because you don't need to ensure that the IP / EIP offset relative to the segment base is the same before/after.

As Margaret points out, the low 2 bits of the CS selector are the Current Privilege Level in 286 or 386 protected mode (as in x86-64 long mode), so loading CS is normally only done when you want to execute different code, not continue executing whatever's next. The case where you do want to keep executing the next instruction usually only comes up during the transition from real mode to protected mode, where you might wish you could load CS to get into 32-bit mode without changing where code-fetch comes from at all. But changing CS inherently updates the CS base address, so even allowing mov cs, reg wouldn't make that easier.
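For reference, a protected-mode selector splits into three fields. Here is a small Python sketch; the function name and the example selector values 0x08 and 0x1B are chosen purely for illustration.

```python
def decode_selector(selector):
    """Split a 16-bit protected-mode segment selector into its fields."""
    selector &= 0xFFFF
    return {
        "index": selector >> 3,                       # descriptor table index
        "table": "LDT" if selector & 0x4 else "GDT",  # table indicator bit
        "rpl": selector & 0x3,                        # low 2 bits; for CS this is the CPL
    }


print(decode_selector(0x08))  # a typical-looking ring-0 code selector
print(decode_selector(0x1B))  # index 3 with privilege level 3
```

Loading a CS selector with different low bits would imply a privilege change, which is another reason the operation belongs to control-transfer instructions and not to a plain register move.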

Peter Cordes