Can the program counter on Intel CPUs be read directly (that is, without 'tricks') in kernel mode or some other mode?
-
Related: [Why can't you set the instruction pointer directly?](https://stackoverflow.com/questions/8333413/why-cant-you-set-the-instruction-pointer-directly/41150027#41150027). You can; the instruction is called `jmp`. My answer there explains why x86 machine code was designed so you can't encode a `mov` to `eip`, the way you can on ARM where PC is one of the general-purpose integer registers. – Peter Cordes Mar 23 '18 at 11:23
7 Answers
No, EIP / IP cannot be accessed directly, but in position-dependent code it's a link-time constant so you can use a nearby (or distant) symbol as an immediate.
mov eax, nearby_label ; in position-dependent code
nearby_label:
To get EIP or IP in position-independent 32-bit code:
call _here
_here: pop eax
; eax now holds the PC.
On CPUs newer than Pentium Pro (or PIII probably), `call rel32` with rel32=0 is special-cased to not affect the return-address predictor stack. So this is efficient as well as compact on modern x86, and is what clang uses for 32-bit position-independent code.
On old 32-bit Pentium Pro CPUs, this would unbalance the call/return predictor stack, so prefer calling a function that does actually return, to avoid branch mispredicts on up to 15 or so future `ret` instructions in your parent functions. (Unless you're not going to return, or so rarely that it doesn't matter.) The return-address predictor stack will recover, though.
get_retaddr_ppro:
mov eax, [esp]
ret ; keeps the return-address predictor stack balanced
; even on CPUs where call +0 isn't a no-op.
In x86-64 mode, RIP can be read directly using a RIP-relative `lea`.
default rel ; NASM directive: use RIP-relative by default
lea rax, [_here] ; RIP + 0
_here:
MASM or GNU `.intel_syntax`: `lea rax, [rip]`
AT&T syntax: `lea 0(%rip), %rax`
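If you want this from C rather than from a standalone asm file, the RIP-relative `lea` can be wrapped in GCC/Clang extended inline asm. This is only a sketch under assumptions: `read_pc` is a made-up name, the asm string is AT&T syntax, and the non-x86-64 branch falls back to GNU C's labels-as-values extension purely so the snippet still compiles on other targets. MSVC would need something different.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from any library). noinline keeps the asm in
 * one place, so the value is an address inside read_pc itself. */
__attribute__((noinline))
uintptr_t read_pc(void)
{
#if defined(__x86_64__)
    uintptr_t pc;
    /* RIP-relative LEA with displacement 0: yields the address of the
     * instruction that follows the LEA. */
    __asm__ __volatile__("lea 0(%%rip), %0" : "=r"(pc));
#else
    /* Assumed fallback for other targets: address of a label (GNU C). */
    uintptr_t pc = (uintptr_t)&&after_read;
after_read:
#endif
    assert(pc != 0);  /* sanity check only */
    return pc;
}
```

Because the function is not inlined, every call returns the same address (a location inside `read_pc`), not the caller's position.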

-
Ah, I love that trick. In ARM, there are pipelining issues when reading the PC. Is this issue present with Intel CPUs as well? – strager Mar 01 '09 at 15:51
-
No, it is not present on x86. (Btw - the ARM pipelining issue is crazy) – Nils Pipenbrinck Mar 01 '09 at 15:57
-
In x86, you have the cost of a branch, which is far worse for the pipeline than just reading the PC. PIC on ARM is far cheaper than x86, even with the pipeline oddities. – Serafina Brocious Mar 01 '09 at 16:01
-
This code actually screws up the return value branch prediction and slows you down quite a lot. I'll try to find a reference for this... – Adam Rosenfield Mar 01 '09 at 16:25
-
Still trying to find a reference, but the keyword is "return address stack", which is a stack of return addresses the processor uses to predict the branch targets of RET instructions. Since CALL pushes EIP onto the RAS, you're screwing up the RAS. – Adam Rosenfield Mar 01 '09 at 16:32
-
You could fix that issue by pushing a label on the stack after you pop the eip, and then doing a ret (which would jump to the label you pushed). – SoapBox Mar 03 '09 at 01:30
-
Don't know why this is the accepted answer over TrayMan's answer. TrayMan's version has no unexpected side effects and is shorter. – Skizz Jun 26 '09 at 08:30
-
http://www.ptlsim.org/Documentation/html/node31.html has a good description of the "return address stack" – mfazekas Apr 23 '10 at 12:22
-
@Cody Brocious: the cost of a branch need not be "far worse for the pipeline" in that the code at the address is most likely in the prefetcher. In this case the decoder will see the call and its address well in advance and be able to plan for them. On the other hand, a typical call to somewhere else is most likely not in the cache and costly for that reason. In a tight loop you will typically see few performance issues with predictable jumps. There are code examples in intel manuals that show how to make conditional jumps "predictable" – Olof Forshell Mar 09 '11 at 01:51
-
Re: _This code actually screws up the return value branch prediction ... I'll try to find a reference for this..._ - the reference is "Intel's 64-ia-32 optimization manual" -> 3.4.1.4 Inlining, Calls and Returns -> "_The return address stack mechanism augments the static and dynamic predictors to optimize specifically for calls and returns. It holds 16 entries, which is large enough to cover the call depth of most programs. ... To enable the use of the return stack mechanism, calls and returns must be matched in pairs_" – Xtra Coder Dec 16 '13 at 06:36
-
Actually, in x86_64 RIP-relative addressing is available, so it's accessible using many instructions like `lea rax, [rip]` – phuclv Jul 06 '14 at 07:34
-
Can we do the same trick using inline asm and Visual C++? I found EAX becomes null after the POP EAX instruction :( – Dev.K. Jan 18 '15 at 14:25
-
@AdamRosenfield it's here https://blogs.msdn.microsoft.com/oldnewthing/20041216-00/?p=36973/ – phuclv Dec 06 '16 at 06:54
-
@AdamRosenfield: turns out that all modern x86 CPUs special-case `call +0` to not break the return address stack! So this common idiom is not terrible after all. http://blog.stuffedcow.net/2018/04/ras-microbenchmarks/#call0. – Peter Cordes May 23 '18 at 11:58
If you need the address of a specific instruction, usually something like this does the trick:
thisone:
mov (e)ax,thisone
(Note: On some assemblers this might do the wrong thing and read a word from [thisone], but there's usually some syntax for getting the assembler to do the right thing.)
If your code is statically loaded to a specific address, the assembler already knows (if you told it the right starting address) the absolute addresses of all instructions. Dynamically loaded code, say as a part of an application on any modern OS, will get the right address thanks to address relocation done by the dynamic linker (provided the assembler is smart enough to generate the relocation tables, which they usually are).
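The same idea carries over to C: taking the address of a function gives you whatever absolute address the linker or loader ended up assigning, with relocations handled for you. A minimal sketch, assuming a flat-address target such as x86 where function pointers are plain code addresses and round-trip through `uintptr_t` (the C standard does not promise this); `answer`, `address_of_answer` and `call_at` are made-up names for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Some code whose address we want to know. */
static int answer(void)
{
    return 42;
}

/* The linker/loader resolves this to the final address of answer(). */
uintptr_t address_of_answer(void)
{
    return (uintptr_t)&answer;
}

/* Turn a raw code address back into a callable pointer. The conversion is
 * implementation-defined; it is assumed to work as on typical x86 ABIs. */
int call_at(uintptr_t addr)
{
    assert(addr != 0);
    int (*fn)(void) = (int (*)(void))addr;
    return fn();
}
```

For example, `call_at(address_of_answer())` ends up calling `answer()`.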

-
I tried this on x86-64 with inline gcc assembly, but got `error in backend: 32-bit absolute addressing is not supported in 64-bit mode` using llvm 6.1.0. Is that LLVM's problem, or not possible in 64-bit mode? – csl Sep 15 '15 at 21:35
-
@csl: either you're on OS X (where there's no workaround other than using `lea rax, [thisone]`), or you're on Linux making a shared object and thus this can't work either. (`mov`-immediate requires a link-time constant address, so it only works in position-dependent code). But if you're making a Linux executable, [your compiler might default to making a position-independent executable, and `-no-pie -fno-pie` make a position-dependent executable where you can use absolute addressing](https://stackoverflow.com/questions/43367427/32-bit-absolute-addresses-no-longer-allowed-in-x86-64-linux). – Peter Cordes Nov 19 '17 at 21:34
On x86-64 you can do for example:
lea rax,[rip] (48 8d 05 00 00 00 00)

-
That's the instruction encoding. There's an implicit 32-bit offset, which is 0; I'm not sure if there is a shorter encoding – matja Jun 26 '09 at 17:59
-
`lea rax, [rip]` did not work in NASM 2.10. It seems that RIP can only be used indirectly with `rel` as in `lea rax, [rel _start]`? – Ciro Santilli OurBigBook.com May 10 '15 at 21:31
There is no instruction to directly read the instruction pointer (EIP) on x86. You can get the address of the current instruction being assembled with a little inline assembly:
// GCC inline assembler; for MSVC, syntax is different
uint32_t eip;
__asm__ __volatile__("movl $., %0" : "=r"(eip));
The `.` symbol gets replaced by the assembler with the address of the current instruction. Note that if you wrap the above snippet in a function call, you'll just get the same address (within that function) every time. If you want a more usable C function, you can instead use some non-inline assembly:
// In a C header file:
uint32_t get_eip(void);
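// In a separate assembly (.S) file (AT&T syntax):
// The leading underscore matches C symbol naming on targets that prefix
// symbols with one; on Linux ELF the symbol would be plain get_eip.
.globl _get_eip
_get_eip:
mov 0(%esp), %eax   // on entry, the top of the stack holds the return address
ret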
This means each time you want to get the instruction pointer, it's slightly less efficient since you need an extra function call. Note that doing it this way does not blow the return address stack (RAS). The return address stack is a separate stack of return addresses used internally by the processor to facilitate branch target prediction for RET instructions.
Every time you have a CALL instruction, the return address (the EIP of the instruction after the CALL) gets pushed onto the RAS, and every time you have a RET instruction, the RAS is popped, and the top value is used as the branch target prediction for that instruction. If you mess up the RAS (such as by not matching each CALL with a RET, as in Cody's solution), you're going to get a whole bunch of unnecessary branch mispredictions, slowing your program down. This method does not blow the RAS, since it has a matched pair of CALL and RET instructions.
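If you'd rather not maintain a separate .S file, GCC and Clang can express roughly the same matched CALL/RET helper in C with `__builtin_return_address(0)`. This is a sketch, not a drop-in replacement for the assembly above: `get_pc_after_call` is a made-up name, and it assumes a compiler that provides these GNU builtins and honours `noinline`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C counterpart of get_eip. noinline keeps a real CALL/RET
 * pair, so the return-address predictor stays balanced. */
__attribute__((noinline))
uintptr_t get_pc_after_call(void)
{
    /* Empty volatile asm: stops the optimizer from treating the function
     * as side-effect free and merging separate calls. */
    __asm__ __volatile__("" ::: "memory");

    /* Address this call will return to, i.e. the instruction right after
     * the CALL in the caller. */
    void *ret = __builtin_extract_return_addr(__builtin_return_address(0));
    assert(ret != 0);  /* sanity check only */
    return (uintptr_t)ret;
}
```

Each call site gets its own value, since the result is the address just past that particular CALL.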

-
Many thanks for the info, I didn't know there were two stacks.. :) – Liran Orevi Mar 01 '09 at 18:02
-
The RAS is an internal stack used by the processor; it's not accessible to code in any way. It's only used for branch target prediction. Without it, code would still function correctly, just more slowly. – Adam Rosenfield Mar 01 '09 at 18:08
-
Thank you so much. Does the RAS get messed up if you manually set ESP after a push? – Liran Orevi Mar 08 '09 at 08:11
-
Or to just get the address of the current function `uintptr_t func_addr = (uintptr_t)&function_name;`. On x86 C/C++ implementations, function pointers are simply code addresses. This will assemble to whatever is needed. – Peter Cordes Nov 19 '17 at 21:52
There is an architecture independent (but gcc dependent) way of accessing the address which is being executed by using labels as values:
http://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html
void foo()
{
void *current_address = &&current_address_label;
current_address_label:
....
}
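Filled in as a complete function, the labels-as-values approach might look like the sketch below. It relies on the GNU C `&&label` extension (GCC and Clang), and `address_inside_foo` is a name invented here; `noinline` is added because the GCC documentation warns that label addresses can behave unpredictably if the function gets inlined or cloned.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical example: returns the address of a label inside itself. */
__attribute__((noinline))
uintptr_t address_inside_foo(void)
{
    /* &&label is the GNU C "address of a label" operator. */
    void *current_address = &&current_address_label;
current_address_label:
    assert(current_address != 0);  /* sanity check only */
    return (uintptr_t)current_address;
}
```

The value is the same on every call, because it names a fixed spot in the function's code rather than wherever the caller happens to be.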

-
@PeterMortensen: IDK what anyone can usefully do with the address of some code inside a C function, but this seems just as good as the `asm("movl $., %0" : "=r"(eip))` answer (if not better, because this one will work in PIC code, too. In that case, the compiler will have to use a RIP-relative LEA or a 32-bit PIC method of getting static addresses, but it will match EIP when execution passes through that label, unless you've copied code around in a way that broke static addressing.) – Peter Cordes Nov 19 '17 at 21:56
You can also read this from /proc/stat. Check the proc manpages.
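For what it's worth, here is a rough sketch of pulling that value out of a stat line on Linux. It assumes the per-process file `/proc/self/stat` and the layout described in proc(5), where `kstkeip` is field 30; `parse_kstkeip` is a made-up helper name. Keep in mind that a process reading its own stat file is inside a system call at that moment, and newer kernels may simply report 0 in this field, so this is probably more useful for inspecting other processes than for finding your own PC.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: extract field 30 (kstkeip in proc(5)) from one line
 * of /proc/<pid>/stat. Returns 0 on success, -1 if the line is malformed
 * or too short. */
int parse_kstkeip(const char *stat_line, unsigned long long *out)
{
    assert(stat_line != NULL && out != NULL);

    /* Field 2 (comm) is parenthesised and may itself contain spaces or
     * parentheses, so resume parsing after the LAST ')'. */
    const char *p = strrchr(stat_line, ')');
    if (p == NULL)
        return -1;
    p++;

    int field = 2;  /* p now sits just past field 2 */
    while (*p != '\0') {
        while (*p == ' ')
            p++;                      /* skip separators */
        if (*p == '\0')
            break;
        field++;
        if (field == 30) {
            char *end;
            errno = 0;
            unsigned long long value = strtoull(p, &end, 10);
            if (end == p || errno != 0)
                return -1;
            *out = value;
            return 0;
        }
        while (*p != '\0' && *p != ' ')
            p++;                      /* skip over this field */
    }
    return -1;
}
```

To use it, read the single line of `/proc/self/stat` (for instance with `fopen` and `fgets`) and pass the buffer in.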

-
I think you mean `/proc/self/stat`, it would also be cool to quote the manpage. – Ciro Santilli OurBigBook.com May 12 '15 at 08:21
-
In `/proc/self/stat` there is a field, according to `man 5 proc`: `kstkeip %lu`: The current EIP (instruction pointer). – Paul Praet May 13 '15 at 09:29
-
Can you expand this answer? E.g. including some context (is this for Linux only?) and consider the information in the other comments. – Peter Mortensen Nov 19 '17 at 20:53
-
If you do this the "normal" way, won't you always get the address of the `syscall` instruction in libc's `read()` function? Doesn't seem useful for finding out your own EIP/RIP value. – Peter Cordes Nov 19 '17 at 21:29
There is a simple way to change the program counter (EIP).
When you call a function with `call`, the return address is pushed onto the stack; when you `ret`, that address is popped from the stack into EIP. So all you have to do is push the value you want and then `ret`. For example:
mov eax, 0x100
push eax
ret
and it's done.

-
That's an answer to a different question (which already exists: [Why can't you set the instruction pointer directly?](//stackoverflow.com/q/8333413)). IDK why you'd push/ret though, instead of simply `jmp eax` – Peter Cordes Feb 05 '20 at 01:50