Why are RBP and RSP called general-purpose registers?

Question

According to Intel, in x64, the following registers are called general-purpose registers (RAX, RBX, RCX, RDX, RBP, RSI, RDI, RSP and R8-R15).

In the article Understanding C by learning assembly, it's written that RBP and RSP are special-purpose registers (RBP points to the base of the current stack frame and RSP points to the top of the current stack frame).

Now I have two contradictory statements. The Intel statement should be the trusted one, but what is correct and why are RBP and RSP called general purpose at all?

You can use both as general purpose registers, meaning the usual arithmetic and logical instructions work with them just fine. `rbp` is pretty much general purpose, the frame pointer thing is just convention. — Jester, Apr 10 '16 at 12:12
Every register has some special-ness (except R8-R15), for some instructions. For RSP, it's special for `push`/`pop`/`call`/`ret`, so most code never uses it for anything else. But under controlled conditions (like no signal handlers) you don't *have* to use it as a stack pointer. For example, you can use it to read an array in a loop with `pop`, like [in this code-golf answer](https://codegolf.stackexchange.com/questions/133618/extreme-fibonacci/135618#135618). (I actually used `esp` in 32-bit code, but same difference). — Peter Cordes, Jul 29 '17 at 12:25
I guess if you extend the definition of "specialness" to encoding, even `r13` is a bit special, although it isn't really functional in that you can still effectively use every addressing mode (even if the assembly is sometimes putting in a hidden zero displacement for you). — BeeOnRope, Jul 30 '17 at 15:26
RBP can be used for general purposes with [`-fomit-frame-pointer`](https://stackoverflow.com/q/14666665/995714). It's harder for RSP though. — phuclv, May 20 '18 at 04:40
@BeeOnRope: `r12` is also special (needs a SIB like RSP), except that it can be an index register. I wrote an answer with all this + @phuclv's interesting point about `r11`, although that doesn't really affect compiler code-gen. — Peter Cordes, Jul 15 '18 at 10:24
`rip` is not general purpose, since you can't use the same `add` and `sub` instructions to alter it. While `jmp 16` is effectively the same as adding 16 to `rip`, you aren't allowed to actually write `add rip,16` and have it work. — puppydrum64, Dec 08 '22 at 17:16
The first link is (effectively) broken: It redirects to a generic page. — Peter Mortensen, Feb 24 '23 at 10:49

jlliagre · Accepted Answer · 2017-07-30T08:23:41.290

39

General purpose means that any of these registers can be used as an operand by any instruction that computes with general-purpose registers, whereas you cannot, for example, do whatever you want with the instruction pointer (RIP) or the flags register (RFLAGS).

Some of these registers were intended for specific uses, and are commonly used that way. The most critical ones are RSP and RBP.

Should you need to use them for your own purpose, you should save their contents before storing something else inside, and restore them to their original value when done.

edited Jul 30 '17 at 08:23

answered Apr 10 '16 at 12:38

jlliagre


Some compilers have the option to not use frame pointers, in which case RBP becomes a general purpose register. – rcgldr Apr 11 '16 at 00:26
It's worth noting that the use of `rbp` as a frame pointer is essentially entirely _convention_ and doesn't really have any CPU support (indeed, the Windows 64 ABI lets you use any register as a frame pointer and doesn't prefer `rbp`). This is very different from `rsp`, which is tightly bound to its function at the hardware level since it is implicitly used by `push`, `pop` and friends. – BeeOnRope Jul 29 '17 at 21:31
@BeeOnRope The LEAVE and ENTER instructions specifically support using RBP as a frame pointer. RBP when used as a base is also SS relative like RSP, rather than DS relative like the others. The ENTER instruction and separate data and stack segments aren't used in modern x86 code, but compilers still generate LEAVE instructions. The fact that RBP can't be used as a base without a displacement also means that it's normally the best register to use as a frame pointer. It's not tightly bound like RSP, but the x86 instruction set favours using RBP as the frame pointer. – Ross Ridge Jul 30 '17 at 00:04
@ross - you are totally correct! I knew about `enter` and `leave`, but forgot! I didn't even know about the SS-relative part. Is that latter one still true in x64? – BeeOnRope Jul 30 '17 at 00:26
@BeeOnRope: Yes, it's still true, but pretty much irrelevant. IIRC, x86-64 requires SS and DS (and ES and CS) to have base=0. IDK what other stuff you can put in a segment descriptor that would matter, in some hypothetical OS. Only FS and GS still more or less have full functionality. – Peter Cordes Jul 30 '17 at 01:04
@BeeOnRope As Peter Cordes said, the SS segment is still used when RBP and RSP are used as a base, but it makes little difference in practice. The only difference I know of is that a non-canonical address will generate a stack fault instead of a general protection fault if the SS segment is used instead of one of the other segment registers. – Ross Ridge Jul 30 '17 at 05:28
Makes sense. These days I think we can more or less lump `rbp` in with the other non-numbered "general purpose" registers in that it has some specific hardcoded uses in certain relatively uncommon instructions, but can also be used generally. The use of `rsp` on the other hand is almost always still for its architecturally defined purpose, outside of some very extreme optimization cases. – BeeOnRope Jul 30 '17 at 15:22

Peter Cordes · Answer 2 · 2022-10-10T23:49:05.890

If a register can be an operand for add, or used in an addressing mode, it's "general purpose", as opposed to registers like the FS segment register, or RIP. The GP registers are also called "integer registers", even though other kinds of registers can hold integers, too.

In computer architecture, it's common for CPUs to internally handle integer registers / instructions separately from FP/SIMD registers / instructions. e.g. Intel Sandybridge-family CPUs have separate physical register files for renaming GP integer vs. FP/vector registers. These are simply called the integer vs. FP register files. (Where FP is short-hand for everything that a kernel doesn't need to save/restore to use the GP registers while leaving user-space's FPU/SIMD state untouched.) Each entry in the FP register file is 256 bits wide (to hold an AVX ymm vector), but integer register file entries only have to be 64 bits wide¹.

But when we say "integer register", we normally mean specifically a general-purpose register.

Note 1: Actually, a typical design is for integer PRF entries to have room for a FLAGS result and/or a GP register, so maybe 70 bits. Since integer instructions also write FLAGS, it makes sense to keep them together instead of allocating from a separate table of tiny registers. (The register allocation table would then just have 2 extra entries, one for CF and one for the rest of the FLAGS, the SPAZO group, to record which PRF entry each part comes from.) On CPUs that rename segment registers (Skylake does not), I guess those would go in an integer PRF entry.

As for the integer part of the architectural state of a user-space task, which a kernel saves/restores on interrupts and system calls, that also includes RFLAGS and RIP. (The kernel usually just doesn't touch FP state.)

"General purpose" in this usage means "data or address", as opposed to an ISA like m68k where you had d0..7 data regs and a0..7 address regs, all 16 of which are integer regs. Regardless of how the register is normally used, general-purpose is about how it can be used.

Every register has some special-ness for some instructions, except some of the completely new registers added with x86-64: R8-R15. These special uses don't disqualify them as general-purpose. The (low 16 bits of the) original 8 date back to 8086, and there were implicit uses of each of them even in the original 8086.

For RSP, it's special for push/pop/call/ret, so most code never uses it for anything else. (And in kernel mode, used asynchronously for interrupts, so you really can't stash it somewhere to get an extra GP register the way you can in user-space code: Is ESP as general-purpose as EAX?)

But under controlled conditions (like no signal handlers) you don't have to use RSP as a stack pointer. For example, you can use it to read an array in a loop with pop, like in this code-golf answer. (I actually used esp in 32-bit code, but same difference: pop is faster than lodsd on Skylake, while both are 1 byte.)
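
As a rough illustration of why `pop` can walk an array, here is a small Python model of the idea, assuming 64-bit `pop` semantics (load a qword from `[rsp]`, then add 8 to `rsp`). The function name and the byte buffer standing in for memory are hypothetical, and this is not the code from the linked answer.

```python
import struct

def sum_with_pop(memory: bytes, array_start: int, count: int) -> int:
    """Model a loop that reads an array of qwords with `pop`."""
    rsp = array_start  # point RSP at the array instead of at a real stack
    total = 0
    for _ in range(count):
        # pop reg: reg = qword at [rsp], then rsp += 8
        (value,) = struct.unpack_from("<Q", memory, rsp)
        rsp += 8
        total = (total + value) & 0xFFFFFFFFFFFFFFFF  # wrap like a 64-bit register
    return total

array = struct.pack("<4Q", 1, 2, 3, 4)
print(sum_with_pop(array, 0, 4))
```

The model also hints at why this only works under controlled conditions: anything that pushes asynchronously (a signal handler, an interrupt in kernel mode) would write just below wherever `rsp` currently points, which here is inside the array.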

Implicit uses and special-ness for each register:

See also x86 Assembly - Why is [e]bx preserved in calling conventions? for a partial list.

I'm mostly limiting this to user-space instructions, especially ones a modern compiler might actually emit from C or C++ code. I'm not trying to be exhaustive for regs that have a lot of implicit uses.

rax: one-operand [i]mul / [i]div / cdq / cdqe, string instructions (stos), cmpxchg, etc. etc. As well as special shorter encodings for many immediate instructions like 2-byte cmp al, 1 or 5-byte add eax, 12345 (no ModRM byte). See also codegolf.SE Tips for golfing in x86/x64 machine code.

There's also xchg-with-eax which is where 0x90 nop came from (before nop became a separately-documented instruction in x86-64, because xchg eax,eax zero-extends eax into RAX and thus can't use the 0x90 encoding. But xchg rax,rax can still assemble to REX.W=1 0x90.)
rcx: shift counts, rep-string counts, the slow loop instruction
rdx: rdx:rax is used by divide and widening-multiply (the one-operand forms), and cwd / cdq / cqo to set up for idiv. Also rdtsc and BMI2 mulx.
rbx: 8086 xlatb. cpuid uses all four of EAX..EDX. 586 (Pentium) cmpxchg8b, x86-64 cmpxchg16b. Most 32-bit compilers will emit cmpxchg8b for std::atomic<long long>::compare_exchange_weak. (Pure load / pure store can use SSE MOVQ or x87 fild/fistp, though, if targeting Pentium or later.) 64-bit compilers will use 64-bit lock cmpxchg, not cmpxchg8b.

Some 64-bit compilers will emit cmpxchg16b for atomic<struct_16_bytes>. RBX has the fewest implicit uses of the original 8, but lock cmpxchg16b is one of the few that compilers will actually use.
rsi/rdi: string ops, including rep movsb which some compilers sometimes inline. (gcc also inlines rep cmpsb for string literals in some cases, but that's probably not optimal).
rbp: leave (only 1 uop slower than mov rsp, rbp / pop rbp. gcc actually uses it in functions with a frame pointer, when it can't just pop rbp). Also the horribly-slow enter which nobody ever uses.
rsp: stack operations: push/pop/call/ret, and leave. (And enter). And in kernel mode (not user space) asynchronous use by hardware to save interrupt context. This is why kernel code can't have a red-zone.
r11: syscall/sysret use it to save/restore user-space's RFLAGS. (Along with RCX to save/restore user-space's RIP).
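
To make the "special shorter encodings" in the rax entry concrete, here is a minimal Python sketch that encodes `op reg, imm` for the eight classic ALU operations on legacy 8-bit and 32-bit registers. The function and table names are hypothetical, and the sketch deliberately ignores REX prefixes, 16/64-bit operand sizes, and memory operands. It only shows that the AL/EAX forms drop the ModRM byte.

```python
import struct

# /n opcode-extension digits for the "group 1" immediate ALU instructions
ALU_OPS = {"add": 0, "or": 1, "adc": 2, "sbb": 3,
           "and": 4, "sub": 5, "xor": 6, "cmp": 7}
REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def encode_alu_imm(op: str, reg: str, imm: int) -> bytes:
    """Encode `op reg, imm` for a legacy 8-bit or 32-bit register (no REX)."""
    n = ALU_OPS[op]
    if reg in REG8:
        imm8 = struct.pack("<B", imm & 0xFF)
        if reg == "al":
            return bytes([0x04 + 8 * n]) + imm8        # short form: no ModRM
        modrm = 0xC0 | (n << 3) | REG8.index(reg)       # mod=11, /n, r/m=reg
        return bytes([0x80, modrm]) + imm8
    modrm = 0xC0 | (n << 3) | REG32.index(reg)
    if -128 <= imm <= 127:
        # sign-extended imm8 form, shorter than any imm32 form
        return bytes([0x83, modrm]) + struct.pack("<b", imm)
    imm32 = struct.pack("<I", imm & 0xFFFFFFFF)
    if reg == "eax":
        return bytes([0x05 + 8 * n]) + imm32            # short form: no ModRM
    return bytes([0x81, modrm]) + imm32

for op, reg, imm in [("cmp", "al", 1), ("cmp", "cl", 1),
                     ("add", "eax", 12345), ("add", "ecx", 12345)]:
    code = encode_alu_imm(op, reg, imm)
    hex_bytes = " ".join(f"{b:02x}" for b in code)
    print(f"{op} {reg}, {imm}: {hex_bytes} ({len(code)} bytes)")
```

With these inputs, the accumulator forms come out one byte shorter than the same operation on `cl` or `ecx`, which is the size saving the rax entry refers to.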

Addressing-mode encoding special cases:

(See also rbp not allowed as SIB base? which is just about addressing modes, where I copied this part of this answer.)

rbp/r13 can't be a base register with no displacement: that encoding instead means: (in ModRM) rel32 (RIP-relative), or (in SIB) disp32 with no base register. (r13 uses the same 3 bits in ModRM/SIB, so this choice simplifies decoding by not making the instruction-length decoder look at the REX.B bit to get the 4th base-register bit). [r13] assembles to [r13 + disp8=0]. [r13+rdx] assembles to [rdx+r13] (avoiding the problem by swapping base/index when that's an option).

rsp/r12 as a base register always needs a SIB byte. (The ModR/M encoding of base=RSP is an escape code to signal a SIB byte, and again, more of the decoder would have to care about the REX prefix if r12 were handled differently.)
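
The two ModRM rules above can be sketched from the decoder's point of view. This is a simplified Python model for 64-bit mode with hypothetical names; it only looks at the ModRM byte and does not decode the SIB byte itself. Note that the two escape checks never consult `rex_b`, which is the point about keeping instruction-length decoding independent of the REX prefix.

```python
LOW_REGS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]
HIGH_REGS = ["r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]

def classify_modrm(mod: int, rm: int, rex_b: int = 0) -> str:
    """Describe the operand selected by ModRM.mod and ModRM.rm (64-bit mode)."""
    if mod == 0b11:
        return "register operand, not memory"
    if rm == 0b100:
        # Escape code: the slot shared by rsp and r12 always means "SIB follows"
        return "SIB byte follows"
    if mod == 0b00 and rm == 0b101:
        # Escape code: the slot shared by rbp and r13 with no displacement
        return "[rip + disp32]"
    base = (HIGH_REGS if rex_b else LOW_REGS)[rm]
    disp = {0b00: "", 0b01: " + disp8", 0b10: " + disp32"}[mod]
    return f"[{base}{disp}]"

print(classify_modrm(0b00, 0b101, rex_b=0))  # rbp slot, no displacement
print(classify_modrm(0b00, 0b101, rex_b=1))  # r13 slot, no displacement
print(classify_modrm(0b01, 0b101, rex_b=1))  # r13 with disp8
print(classify_modrm(0b00, 0b100, rex_b=1))  # r12 slot
```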

rsp can't be an index register. This makes it possible to encode [rsp], which is more useful than [rsp + rsp]. (Intel could have designed the ModRM/SIB encodings for 32-bit addressing modes (new in 386) so SIB-with-no-index was only possible with base=ESP. That would make [eax + esp*4] possible and only exclude [esp + esp*1/2/4/8]. But that's not useful, so they simplified the hardware by making index=ESP the code for no index regardless of the base. This allows two redundant ways to encode any base or base+disp addressing mode: with or without a SIB.)

r12 can be an index register. Unlike the other cases, this doesn't affect instruction-length decoding. Also, it can't be worked around with a longer encoding like the other cases. AMD wanted AMD64's register set to be as orthogonal as possible, so it makes sense they'd spend a few extra transistors to check REX.X as part of the index / no-index decoding. For example, [rsp + r12*4] requires index=r12, so having r12 not fully general purpose would make AMD64 a worse compiler target.

   0:   41 8b 03                mov    eax,DWORD PTR [r11]
   3:   41 8b 04 24             mov    eax,DWORD PTR [r12]      # needs a SIB like RSP
   7:   41 8b 45 00             mov    eax,DWORD PTR [r13+0x0]  # needs a disp8 like RBP
   b:   41 8b 06                mov    eax,DWORD PTR [r14]
   e:   41 8b 07                mov    eax,DWORD PTR [r15]
  11:   43 8b 04 e3             mov    eax,DWORD PTR [r11+r12*8] # *can* be an index
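
The bytes in this listing can be reproduced with a small encoder. The following is a minimal Python sketch with hypothetical names that handles only `mov eax, DWORD PTR [base]` and `[base + index*scale]` with no explicit displacement. It assumes the rules described above and does not swap base and index the way an assembler might.

```python
from typing import Optional

REGS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
        "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]
SCALE_BITS = {1: 0, 2: 1, 4: 2, 8: 3}

def encode_mov_eax_mem(base: str, index: Optional[str] = None,
                       scale: int = 1) -> bytes:
    """Encode `mov eax, DWORD PTR [base (+ index*scale)]`."""
    b = REGS.index(base)
    rex = 0x40 | (b >> 3)                      # REX.B = 4th bit of the base
    x = None
    if index is not None:
        if index == "rsp":
            raise ValueError("rsp cannot be an index register")
        x = REGS.index(index)
        rex |= (x >> 3) << 1                   # REX.X = 4th bit of the index
    # rbp/r13 as base: the no-displacement encoding is taken, so add disp8=0
    needs_disp8 = (b & 7) == 5
    # rsp/r12 as base, or any index register: a SIB byte is required
    needs_sib = x is not None or (b & 7) == 4
    mod = 0b01 if needs_disp8 else 0b00
    rm = 0b100 if needs_sib else (b & 7)
    out = []
    if rex != 0x40:
        out.append(rex)                        # omit an empty REX prefix
    out += [0x8B, (mod << 6) | rm]             # ModRM.reg = 000 selects eax
    if needs_sib:
        index_bits = (x & 7) if x is not None else 0b100   # 100 = no index
        out.append((SCALE_BITS[scale] << 6) | (index_bits << 3) | (b & 7))
    if needs_disp8:
        out.append(0x00)
    return bytes(out)

for args in [("r11",), ("r12",), ("r13",), ("r14",), ("r15",), ("r11", "r12", 8)]:
    print(args, " ".join(f"{byte:02x}" for byte in encode_mov_eax_mem(*args)))
```

In this model, index bits `100` mean "no index" only when REX.X is clear, which is how `r12` remains usable as an index while `rsp` does not.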

Compilers like it when all registers can be used for anything, only constraining register allocation for a few special-case operations. This is what's meant by register orthogonality.

Also register **DX** is special in instructions IN, OUT, INS, OUTS. — vitsoft, Sep 21 '18 at 19:02
@vitsoft: as I said, *I'm not trying to be exhaustive*, just to cover uses that are actually still relevant, especially for compiler-generated code. Only mentioning obscure uses if there's nothing else. — Peter Cordes, May 25 '20 at 09:22
