30

The Wikipedia article about x86 assembly says that "the IP register cannot be accessed by the programmer directly."

Directly means with instructions like mov and add, the same way we can read and write EAX.

Why not? What is the reason behind this? What are the technical restrictions?


There are special instructions like jmp to set it, and call to push the old value before setting a new one. (And in x86-64, read with LEA using a RIP-relative addressing mode.) See Reading program counter directly for details.

Peter Cordes

4 Answers

34

You can't access it directly because there's no legitimate use case. Having any arbitrary instruction change eip would make branch prediction very difficult, and would probably open up a whole host of security issues.

You can change eip using jmp, call or ret. You just can't read from or write to eip directly using normal operations.

Setting eip to a register is as simple as jmp eax. You can also do push eax; ret, which pushes the value of eax to the stack and then returns (i.e. pops and jumps). The third option is call eax which does a call to the address in eax.
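As a rough sketch, the three ways of loading eip from eax described above can be chained into one small program. This assumes NASM syntax and a 32-bit Linux target; the label names (`via_jmp`, `via_ret`, `helper`) and the build commands are illustrative and not part of the original answer.

```nasm
; NASM syntax, 32-bit Linux (assumed). Assumed build:
;   nasm -f elf32 set_eip.asm && ld -m elf_i386 -o set_eip set_eip.o
bits 32
global _start

section .text
_start:
    mov  eax, via_jmp
    jmp  eax            ; 1) indirect jump: eip = eax

via_jmp:
    mov  eax, via_ret
    push eax            ; 2) push the target address...
    ret                 ;    ...and let ret pop it into eip

via_ret:
    mov  eax, helper
    call eax            ; 3) indirect call: pushes the return address, then eip = eax
                        ;    helper's ret resumes at the next instruction
    mov  ebx, 0         ; exit status 0
    mov  eax, 1         ; sys_exit in the int 0x80 ABI
    int  0x80

helper:
    ret                 ; pops the return address pushed by call eax
```

Only the `call` variant leaves a return address on the stack. The `jmp` and `push`/`ret` variants transfer control without any way back.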

Reading can be done like this:

    call get_eip
get_eip:
    pop eax ; eax now contains the address of this instruction
Polynomial
  • 1
    Yes, that's correct. You'll often see stuff like `jmp [eax]` or `jmp [esp+4]` used to implement dynamic calls or call tables. – Polynomial Nov 30 '11 at 22:02
  • Do you have a citation, or are you just making an educated guess? – Rob Kennedy Nov 30 '11 at 22:04
  • @RobKennedy - For which part? – Polynomial Nov 30 '11 at 22:12
  • For the first sentence, where you claim what the reason is. – Rob Kennedy Nov 30 '11 at 22:16
  • @RobKennedy - I don't know of any direct citation I could make, other than common sense. An arbitrary write to `eip` would invalidate the instruction cache (you can't cache instructions you can't predict) and it would make branch prediction impossible (you can't predict what you don't know). As for the security issues, limiting the instruction types that can perform a jump to potential shellcode is an obvious benefit. It also limits the potential for corrupt or buggy code from trashing data in kernel code (ring0) where there's no memory virtualisation. – Polynomial Nov 30 '11 at 22:29
  • 1
    Wouldn't `mov eax, offset get_ip` work? How about `mov eax, $`? Admittedly, it's been quite some time since I wrote any assembly language stuff . . . – Jim Mischel Nov 30 '11 at 22:30
  • @JimMischel - No. `offset get_ip` would give you the offset from the current segment, not the virtual address. If you've got ASLR enabled, this is an unreliable measure of the actual addresses. I'm guessing `mov eax, $` is a dereference (I don't recognise the syntax), which would jump to the address in memory at the address specified. For example, if the memory address `00501234` contains `CDAB4000` and `eax` is `00501234`, when you execute `jmp [eax]` it will dereference `eax` (which is `00501234`) and get `CDAB4000`, which represents `0040ABCD` in little endian, which it will jump to. – Polynomial Nov 30 '11 at 22:37
  • `mov eax, $` won't jump anywhere. It will load the `eax` register with, I think, the address of the current instruction. As I recall, `jmp $` is the same thing as `idiotLoop: jmp idiotLoop`. Not that I ever wrote anything like that . . . *cough* – Jim Mischel Nov 30 '11 at 22:54
  • `$` is just a compiler macro. It will still get compiled to some actual assembly, e.g. `mov eax, $` -> `call dummy; dummy: pop eax`. – Polynomial Nov 30 '11 at 22:59
  • I happened to stumble upon the RIP-relative addressing mode for 64-bit platforms, which makes position-independent code much more efficient than on 32-bit, and was wondering whether that gives us a legitimate use case for supporting reads of eip. Any thoughts on why reading eip is also disallowed? – Chethan Ravindranath Jul 11 '12 at 16:07
  • @ChethanRavindranath That's an interesting question. I think the answer is that the number of use cases for *directly* reading eip are slim, or at least were when x86-32 was conceived, especially when you consider that eip is provided automatically (e.g. in the stack) for situations such as exceptions, interrupts and context switches. – Polynomial Jul 11 '12 at 17:56
  • @Polynomial: Makes sense! I guess they could have fixed it later, but this problem in 32-bit adds another reason for shifting to 64 bit computing! A bit philosophical thought though... Thanks! – Chethan Ravindranath Jul 11 '12 at 18:52
  • @Polynomial, actually, I think this could have made dynamic recompilers/optimizers/JITers much simpler, but there are probably workarounds. – Leeor Aug 25 '15 at 11:49
  • WRONG: your code for getting eip is wrong. you should use mov eax, [esp] – Amir Aug 25 '15 at 12:10
  • 1
    @Amir Read it again; it is correct. The `call` pushes the instruction pointer to the top of the stack, and the `pop eax` moves the instruction pointer into `eax` and reverts the stack to its correct position. Notice that there is no `ret`; the `call` instruction is not intended to be used as a call in this case, so leaving the return pointer on the stack would unbalance it, and cause an infinite loop (or crash) upon the next `ret`. – Polynomial Aug 25 '15 at 12:20
  • @Amir One can, of course, use `mov eax, [esp]` to read the top item in the stack, but then you would still have to rebalance the stack by popping later or using `add esp, 4` to shift off the superfluous return pointer. – Polynomial Aug 25 '15 at 12:21
  • Don't you use ret at the end of your function? So how do you set eip afterwards? With a jump? – Amir Aug 25 '15 at 13:08
  • @Amir You seem to be mixing up getting and setting eip. You can move eip into a register using the call/pop trick, and you can set it using push/ret (or `mov [esp], eax; ret`, or `jmp eax` if you like). They're two separate things. – Polynomial Aug 25 '15 at 14:54
  • 11
    Your first paragraph is bogus. [**ARM has its program-counter totally exposed for read/write as R15**](http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0473f/Babbdajb.html). ARM64 dropped that, but it didn't make ARM32 impossible. Part of branch prediction needs to happen before instructions are even decoded, to avoid fetch bubbles. At decode time, detecting that EIP is a destination register and marking it as a branch is not particularly hard. There are no security implications, because security doesn't depend on scanning the instruction stream to detect branch instructions. – Peter Cordes Dec 14 '16 at 18:25
  • 7
    Any explanation regarding instruction cache, branch prediction and other fancy stuff looks fishy to me for a simple reason: x86 was born as a microcontroller architecture, which had none of these frills. It's not like they took away ip access because it made moving to a superscalar architecture difficult - it simply wasn't there from the beginning. Probably they didn't add it because there's already `jmp` to set it, and there was no compelling enough use case to either add a specific instruction to read it or to steal precious bits from the mod-reg-rm byte of general purpose instructions. – Matteo Italia Dec 14 '16 at 19:05
  • 2
    This isn't very good reasoning; there is a very real use case, namely an alternative to labels. Without being able to read eip directly and save it, you need to count bytes. With it, you could get the instruction pointer without formulating it in terms of a relative "call". – Dmytro Jan 21 '17 at 23:13
  • 1
    @Dmitry: A writeable EIP (with instructions other than `jmp`) isn't necessary, and you don't want it using up one of your 8 general-purpose register encodings. But being able to read it efficiently would have been nice for position-independent code. Fortunately x86-64 fixed that with RIP-relative addressing, including `lea rax, [rip]`; otherwise `call`/`pop` is usually best. (Fun fact: `call +0` (destination = next instruction) is a special case, and doesn't unbalance the return-address predictor stack, so call/pop isn't bad; see http://blog.stuffedcow.net/2018/04/ras-microbenchmarks/#call0.) – Peter Cordes Jul 16 '18 at 05:35
  • after a bit more thinking, I found that my hatred of jumps was largely unwarranted since as long as the condition on which jumps decide whether to jump or not are in a register already, or if there is no condition and the jump/call is to a block that is relatively near to current address, it works incredibly fast; fast enough to warrant trees or even absurd chains of if statements over the O(1) dictionary lookups. so performance wise, it's hard to use jump/call enough for it to become noticeably better than a hypothetical fetch/store of eip. It'd help readability (continued) – Dmytro Jul 18 '18 at 22:23
  • but if readability is that big of a deal, one can always create their own macro to allow accessing eip, which would do that via call to a local label or call relative offset instruction. – Dmytro Jul 18 '18 at 22:24
18

That would have been a possible design for x86. ARM does expose its program counter for read/write as R15. That's unusual, though.

It allows a very compact function prologue/epilogue, along with the ability to push or pop multiple registers with a single instruction: push {r5, lr} on entry, and pop {r5, pc} to return. (Popping the saved value of the link register into the program counter).

However, it makes high-perf / out-of-order ARM implementations less convenient, and was dropped for AArch64.


So it's possible, but uses up one of the registers. 32-bit ARM has 16 integer registers (including PC), so a register number takes 4 bits to encode in ARM machine code. Another register is almost always tied up as the stack pointer, so ARM has 14 general-purpose integer registers. (LR can be saved to the stack, so it can be and is used as a general-purpose register inside function bodies).

Most of modern x86 is inherited from 8086. It was designed with fairly compact variable-length instruction encoding, and only 8 registers, requiring only 3 bits for each src and dst register in the machine code.

In the original 8086, they were not very general-purpose, and SP-relative addressing isn't possible in 16-bit mode, so essentially 2 registers (SP and BP) are tied up for stack stuff. This leaves only 6 somewhat-general purpose registers, and having one of them be the PC instead of general-purpose would be a huge reduction in available registers, greatly increasing the amount of spill/reload in typical code.
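To make the 3-bit register fields concrete, here is a small sketch of how one register-to-register instruction is laid out in 32-bit machine code. It assumes NASM syntax. The `db` line is only there to spell out the bytes, and the note that NASM picks this particular encoding reflects typical assembler behaviour rather than anything stated in the answer (`8B C3` is an equally valid encoding of the same instruction).

```nasm
; NASM syntax, 32-bit code (assumed). Register numbers used in the 3-bit fields:
;   eax=000 ecx=001 edx=010 ebx=011 esp=100 ebp=101 esi=110 edi=111
bits 32

    mov eax, ebx        ; typically assembled as the two bytes 89 D8
    db  0x89, 0xD8      ; the same two bytes written out by hand:
                        ;   0x89 = opcode for mov r/m32, r32
                        ;   0xD8 = 11 011 000b (the ModRM byte)
                        ;          mod = 11   register-direct operand
                        ;          reg = 011  source      = ebx
                        ;          r/m = 000  destination = eax
```

All eight values of each 3-bit field already name a register, so an addressable instruction pointer would have had to take over one of those eight slots.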


AMD64 added r8-r15 and the RIP-relative addressing mode. lea rsi, [rip+whatever], together with RIP-relative addressing for direct access to static data and constants, is all you need for efficient position-independent code. Indirect JMP instructions are totally sufficient for writing to RIP.
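A minimal sketch of RIP-relative addressing in practice, assuming NASM syntax on x86-64 Linux. NASM spells the addressing mode `[rel label]` rather than `[rip+whatever]`. The labels `here` and `msg`, the message text, and the build commands are illustrative assumptions.

```nasm
; NASM syntax, x86-64 Linux (assumed). Assumed build:
;   nasm -f elf64 rip_rel.asm && ld -o rip_rel rip_rel.o
global _start

section .rodata
msg:     db "hello", 10
msg_len  equ $ - msg

section .text
_start:
here:
    lea  rbx, [rel here]    ; rbx = run-time address of this instruction (reads RIP indirectly)
    lea  rsi, [rel msg]     ; rsi = run-time address of msg, wherever the code was loaded
    mov  edx, msg_len       ; number of bytes to write
    mov  edi, 1             ; file descriptor 1 (stdout)
    mov  eax, 1             ; sys_write
    syscall

    xor  edi, edi           ; exit status 0
    mov  eax, 60            ; sys_exit
    syscall
```

Both `lea` instructions encode only a displacement from the next instruction's address, so the code needs no load-time fixups and no `call`/`pop` sequence.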

There isn't really anything to be gained by allowing arbitrary instructions to be used to read or write the PC, since you can always do the same thing with an integer register and an indirect jump. It would be almost pure downside for x86-64's R15 to be the same thing as RIP, especially for the architecture's performance as a compiler target. (Hand-written asm weird stuff was already very much an uncommon niche thing by 2000, when AMD64 was designed.)
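As an illustration of that point, the effect of a hypothetical "add to the program counter" instruction can be had with an ordinary register and an indirect jump. This is only a sketch, assuming NASM syntax on x86-64 Linux; the labels `first`, `second` and `done` are made up for the example.

```nasm
; NASM syntax, x86-64 Linux (assumed). Assumed build:
;   nasm -f elf64 pc_add.asm && ld -o pc_add pc_add.o
global _start

section .text
_start:
    mov  ecx, second - first    ; byte distance to skip (assemble-time constant)
    lea  rax, [rel first]       ; "read the PC": rax = address of first
    add  rax, rcx               ; ordinary arithmetic on the copy
    jmp  rax                    ; "write the PC": rip = rax

first:
    mov  edi, 1                 ; skipped: would exit with status 1
    jmp  done
second:
    xor  edi, edi               ; reached: exit status 0
done:
    mov  eax, 60                ; sys_exit
    syscall
```

The program exits with status 0, because the computed jump lands on `second` and never executes the code at `first`.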

So AMD64 is really the first time that x86 could plausibly have gained a fully-exposed program counter like ARM, but there were many good reasons not to do that.

ecm
Peter Cordes
  • 1
    Related: [Is it possible to manipulate the instruction pointer in 8086 assembly?](https://stackoverflow.com/questions/47301935/is-it-possible-to-manipulate-the-instruction-pointer-in-8086-assembly/47374394#47374394): yes, write it with `jmp`, read it with `call`. – Peter Cordes Mar 06 '18 at 02:13
  • 1
    It turns out that `call +0` is fine, and doesn't unbalance the return-address predictor, so `call`/`pop` is actually best. http://blog.stuffedcow.net/2018/04/ras-microbenchmarks/#call0. – Peter Cordes Jul 16 '18 at 05:41
  • update/footnote: ARM isn't the only ISA with PC as one of the "general purpose" registers: PDP-11 did that, too: https://en.wikipedia.org/wiki/PDP-11_architecture#CPU_registers . In very early machines, it makes some sense that 8 was more register than you wanted to build, so instead of 8 GPRs and a separate PC, you make PC addressable. And a pipelined implementation probably wasn't even in sight when PDP-11 was designed. – Peter Cordes Jan 11 '23 at 15:27
4

jmp will set the EIP register.

This code will set eip to 0x00401000:

    mov eax, 0x00401000 ; hex literal; a bare 00401000 would be read as decimal by most assemblers
    jmp eax             ; set EIP to 0x00401000

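And to get EIP:

    call GetEIP        ; pushes the address of the next instruction
    ; ...
    ; ...
GetEIP:
    mov eax, [esp]     ; copy the return address without popping it
    ret                ; return to the instruction after the call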
Amir
  • 1
    and how do you do this without using labels without counting bytes or writing your own higher language that automates counting bytes? – Dmytro Jan 21 '17 at 23:15
  • @Dmitry: You have to know where you're jumping, so either you need an absolute numeric address, or you need to use labels. (Or count bytes, but seriously just use local labels, that's what they're for.) – Peter Cordes Feb 27 '18 at 07:15
  • that's a false dichotomy; assembly knows many ways to jump, such as those listed here: https://c9x.me/x86/html/file_module_x86_id_147.html and while they're not supported by any assembler i know of(or aren't easy to find in documentation), you can force them by creating a macro that defines the bytes inline of code, eg `db 0xeb, 0x0` for near relative jump to current ip. if assemblers knew how to `sizeof(nop;nop;nop;nop)` at preprocessor level, we could calculate offset inline to avoid counting errors too. – Dmytro Feb 28 '18 at 18:29
  • 1
    It turns out that `call +0` is fine, and doesn't unbalance the return-address predictor, so `call`/`pop` is actually best. http://blog.stuffedcow.net/2018/04/ras-microbenchmarks/#call0. – Peter Cordes Jul 16 '18 at 05:40
4

I think they meant that the IP register cannot be accessed directly in the same way the other registers are accessed. Programmers can definitely write to IP, for example by issuing a jump instruction.

Sergey Kalinichenko