Instructions with Long (32 and 64 bit) immediate operands in RISC processors

Question

Are operations with large immediate operands possible on RISC processors when the immediate is too large to fit in the 32-bit instruction word (standard for RISC architectures)? Say we want to store a 32-bit or 64-bit immediate in a register, or execute a simple arithmetic instruction with an integer of that size.

Here are some examples in pseudocode.

Let r1 and r2 be register names, imm32 an immediate operand 32 bits long, and imm64 an immediate operand 64 bits long.

We want to execute the following instructions, written in pseudocode like:

1) r1 = imm32
2) r1 = imm64
3) r1 = r2 + imm32
4) r1 = r2 - imm64

Are such instructions possible on popular RISC platforms? For example, consider well-known architectures such as MIPS, RISC-V, SPARC, DEC Alpha64 (a dead but famous processor family), ARM and Power. If they are possible, what would the code for these instructions be, and how are they stored in memory when a RISC instruction word contains only 32 bits? If they aren't, how can the pseudocode given above be emulated?

Please note that “big list” kind of questions are not too well received on this site as they are by definition open ended. I have written an answer that gives a general overview over the subject matter; if you want to know what it looks like on specific architectures, go read the instruction set references or ask a compiler. — fuz, Jul 26 '22 at 04:58
Note that this is your [second](https://stackoverflow.com/q/73109209/417501) “big list” kind of question. Do not ask questions in this manner. The effort to give the answer you demand is immense and is ultimately busywork that you should do yourself by reading the instruction set references in question. — fuz, Jul 26 '22 at 05:08
https://godbolt.org/ has compilers for most of those ISAs; write `long foo(long x){ return x+123456789; }` and see what you get. (`clang -target sparc` can give you SPARC asm). Try AArch64 with a repeating pattern like `0x1111111111111111` then go read about its immediate encodings. — Peter Cordes, Jul 26 '22 at 05:48

score 3 · Answer 1 · answered Jul 26 '22 at 05:14

Yes, of course. Being RISC doesn't mean having fixed-length instructions. It simply means the architecture is load-store, as opposed to an orthogonal design that allows memory operands. ARM has the Thumb-2 extension with variable-length instructions of 16 and 32 bits. Similarly, MIPS has the MIPS16e and microMIPS extensions.

I don't know whether they allow 48-bit instructions, but RISC-V definitely does. It has been designed from the ground up to support any instruction length that's a multiple of 16 bits. Using a 48-bit instruction we can easily fit a 32-bit immediate. Similarly, a 64-bit immediate can be embedded in 80 bits or more, leaving lots of space for the opcode and everything else. The instruction length is encoded in the low-order bits of the first 16-bit parcel of each instruction.
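The length-encoding idea can be sketched in a few lines of Python. This is only an illustration: the function name is made up here, and the bit patterns follow the expanded instruction-length encoding as described in the RISC-V unprivileged specification, so check them against the current spec before relying on them. Encodings of 80 bits or more are deliberately not modelled.

```python
def riscv_insn_length(first_parcel: int) -> int:
    """Return the instruction length in bytes implied by the first 16-bit parcel.

    Hypothetical helper for illustration; only lengths up to 64 bits are handled.
    """
    p = first_parcel & 0xFFFF
    if p & 0b11 != 0b11:
        return 2  # low two bits are not 11: compressed 16-bit instruction
    if p & 0b11100 != 0b11100:
        return 4  # bits [4:2] are not 111: standard 32-bit instruction
    if p & 0b111111 == 0b011111:
        return 6  # bits [5:0] == 011111: 48-bit instruction
    if p & 0b1111111 == 0b0111111:
        return 8  # bits [6:0] == 0111111: 64-bit instruction
    raise ValueError("80-bit or longer encoding is not modelled in this sketch")
```

For example, the low parcel of the usual 32-bit `nop` (`0x00000013`) is `0x0013`, which this sketch classifies as a 4-byte instruction.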

See also

How does RISC-V variable length of instruction work in detail?
RISC-V Opens the Door on 48-bit Computing | Agam Shah, HPC Wire (yes, the title is ridiculously terrible - it has nothing to do with 48-bit architectures, only a possibility to have 48-bit instructions)

You're absolutely correct: "Yes of course. Being RISC doesn't mean having fixed-length instructions". Marco Bonelli and fuz are also correct (with answers that superficially might appear contradictory). The problem is the OP's question: a) it's a "big list", and b) it's "vague". The OP needs to be more specific in order to get an accurate reply. — paulsm4, Jul 27 '22 at 20:10

score 1 · Answer 2 · answered Jul 26 '22 at 04:58

By the pigeonhole principle, not all n-bit constants can fit into an instruction word of n bits or fewer if that word also encodes other things. So no, RISC architectures with fixed instruction length cannot support arbitrary immediates.

To work around this, the immediate has to be first loaded into a register so it can then be used with a register-operand variant of the instruction. Some assemblers can do this automatically. The immediate may then still be too big to be loaded in one chunk. The following two techniques are commonly used to work around this:

1. load the immediate from a nearby literal pool using a pc relative addressing mode
2. load the immediate with a sequence of instructions where each instruction sets some bits of the immediate (e.g. one instruction loads the low 16 bits, the second sets the high 16 bits)

In modern RISC designs, approach (2) is usually recommended and the processor likely employs macro fusion to perform the two or more load-immediate instructions in one step.
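The second technique can be sketched in Python as follows. The sketch is loosely modelled on AArch64's `movz`/`movk` pair (the first instruction zeroes the register and sets one 16-bit chunk, each later one overwrites only its own chunk); the function names are hypothetical, and real assemblers and compilers use further tricks that are not shown here.

```python
MASK64 = (1 << 64) - 1

def split_into_chunks(value: int, width: int = 64):
    """Return (shift, imm16) pairs for the nonzero 16-bit chunks of value."""
    value &= (1 << width) - 1
    chunks = [(s, (value >> s) & 0xFFFF) for s in range(0, width, 16)]
    nonzero = [(s, c) for s, c in chunks if c != 0]
    return nonzero or [(0, 0)]  # loading zero still takes one instruction

def emulate(chunks):
    """Emulate the instruction sequence on a 64-bit register."""
    reg = 0
    for i, (shift, imm16) in enumerate(chunks):
        if i == 0:
            reg = imm16 << shift  # movz-like: clear everything, set one chunk
        else:
            # movk-like: replace only the 16 bits at this shift
            reg = (reg & ~(0xFFFF << shift) & MASK64) | (imm16 << shift)
    return reg

seq = split_into_chunks(0x123456789ABCDEF0)
assert emulate(seq) == 0x123456789ABCDEF0  # four instructions rebuild the value
```

Under this scheme a 32-bit constant needs at most two instructions and a 64-bit constant at most four, with fewer when some chunks are zero.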

Note that besides being able to work around this limitation, modern RISC architectures usually try to make the limited space for immediates as useful as possible. For example, ARM does not just store an immediate but rather an immediate and a rotation amount, permitting many commonly used constants and bit patterns (in case of Thumb2) to be used directly.
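To illustrate the immediate-plus-rotation idea, here is a rough Python check of whether a 32-bit constant fits the classic ARM (A32) data-processing immediate format, assumed here to be an 8-bit value rotated right by twice a 4-bit amount. The function names are made up for this sketch, and the extra Thumb-2 bit patterns are not covered.

```python
def ror32(x: int, n: int) -> int:
    """Rotate a 32-bit value right by n bits."""
    n %= 32
    return ((x >> n) | (x << (32 - n))) & 0xFFFFFFFF

def rol32(x: int, n: int) -> int:
    """Rotate a 32-bit value left by n bits."""
    return ror32(x, 32 - (n % 32))

def encode_arm_imm(value: int):
    """Return (imm8, rot4) with ror32(imm8, 2 * rot4) == value, or None."""
    value &= 0xFFFFFFFF
    for rot4 in range(16):
        imm8 = rol32(value, 2 * rot4)  # undo a right-rotation by 2 * rot4
        if imm8 <= 0xFF:
            return imm8, rot4
    return None
```

With this check, `0xFF000000` is encodable (as `0xFF` rotated right by 8), while `0x12345678` is not and has to be built by one of the two techniques above.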

answered Jul 26 '22 at 04:58

fuz

Technique #1 that you mentioned ("load the immediate from a nearby literal pool using a pc relative addressing mode") actually means that the program counter (called the instruction pointer on other processors) is used as the base register in a register-indirect addressing mode. It seems this trick is absolutely impossible on Intel/AMD x86 platforms (both 32 and 64 bit). But I heard it was in favour on the PDP-11. Strictly speaking, the PDP-11 had no immediate addressing mode, but it had several register-indirect modes, and the PC register could be used in these modes. – JSpruce Jul 26 '22 at 05:53
Thus, the PC, when used for indirect addressing, emulated an immediate operand. The immediate operand was EMULATED on the PDP-11. But I could not believe that this trick for emulating immediates is in use in RISC architectures. It contradicts the RISC philosophy so much, and it's so unobvious. Perhaps it's not the rule but the exception in the RISC world. Are there many RISC processors which use this trick? – JSpruce Jul 26 '22 at 05:53
@JSpruce Amd64 has a RIP-relative addressing mode, the other operating modes of the x86 architecture do not. RISC architectures as well as other modern architectures typically have such an addressing mode, too. It is very useful for all sorts of things, including writing position independent code. – fuz Jul 26 '22 at 05:55
@JSpruce: x86-64 has RIP-relative addressing (unlike 32-bit mode), but we don't normally use it for integer constants because x86-64 also has `mov r64, imm64`. Very common for FP and SIMD constants, though. Have you looked at compiler output for x86-64? [Why does this MOVSS instruction use RIP-relative addressing?](https://stackoverflow.com/q/44967075) . But anyway, that's x86, a CISC. A "literal pool" nearby is very common on ARM. – Peter Cordes Jul 26 '22 at 05:55
@JSpruce The literal pool approach is different from the PDP-11 approach where it's just a post-increment addressing mode on `PC`. What happens on RISC architectures is that you put the immediate into memory near the code and just fetch it with a PC-relative addressing mode. This is used widely on ARM, though I don't really know about other architectures (they do generally support it though). – fuz Jul 26 '22 at 05:56
To extend on Peter Cordes' response: the amd64 rip-relative addressing mode (rip is their program counter/“instruction pointer”) has a 32 bit displacement, so you usually put constants into the (ro)data section instead of nearby to improve cache utilisation. – fuz Jul 26 '22 at 05:58
@fuz: ARM didn't (until movw/movk) have a 2-instruction way to construct arbitrary 32-bit constants. It only had 8-bit-rotated immediates, so your other option was MOV + 3x OR, in the worst case if there's no pattern. MIPS/RISCV, PowerPC, and most other classic RISCs following the MIPS mold can construct immediates in 2 instructions, so 2 total words of size in the text section, with a load-upper-immediate with enough immediate bits to add up to 32 along with an add-immediate or load displacement. – Peter Cordes Jul 26 '22 at 06:00
Also, MIPS (and many other RISCs) only have one addressing mode, `imm16(register)`. And most don't have PC as one of their 32 GPRs, so they don't have PC-relative addressing directly. MIPS didn't have anything like AUIPC until much later than MIPS I, although I guess it could always have used `jal` to get a code address into LR, then an `lw`, so 3 total words including the data. (And can amortize the setup over multiple loads if necessary, for 1 instruction executed per 32-bit constant loaded.) – Peter Cordes Jul 26 '22 at 06:06

Marco Bonelli · Answer 3 · 2022-07-26T05:28:26.377

Are operations with large immediate numbers possible in RISC processors?

Nope, at least not in general, they usually aren't, as you might guess. This is because most RISC processors have a fixed instruction encoding length, and therefore cannot encode immediates larger than (or equal to) the length of the instruction itself.

In order to work with large immediates, you usually have pseudo-instructions that are understood by the assembler (but don't really exist) and then split into multiple instructions when assembling.

As an example, in MIPS 32-bit all instructions are 4 bytes long (32-bit), so even only using one bit for the opcode itself, you wouldn't be able to work with immediates larger than 31 bits. In fact, immediates in MIPS 32-bit can only take at most 16 of the 32 bits of an instruction. To load a large immediate into a register you have to use two instructions: LUI (Load Upper Immediate) plus another ALU instruction to modify the lower 16 bits. The assembler provides the LI (Load Immediate) pseudo-instruction, which is assembled into up to two instructions.

# Load 0x12345678 into $t0
LUI $t0, 0x1234
ORI $t0, $t0, 0x5678
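How an assembler might expand LI can be sketched in Python. The function `expand_li` is hypothetical and the selection rules are an assumption made for illustration; a real MIPS assembler may pick different instructions in the short cases.

```python
def expand_li(reg: str, value: int):
    """Return a list of MIPS instructions that load a 32-bit value into reg."""
    value &= 0xFFFFFFFF
    hi, lo = value >> 16, value & 0xFFFF
    if hi == 0:
        # Fits in 16 bits unsigned: ORI zero-extends its immediate.
        return [f"ori {reg}, $zero, {lo:#x}"]
    if hi == 0xFFFF and lo & 0x8000:
        # Small negative number: ADDIU sign-extends its 16-bit immediate.
        return [f"addiu {reg}, $zero, {lo - 0x10000}"]
    if lo == 0:
        # Only the upper half is set: LUI alone is enough.
        return [f"lui {reg}, {hi:#x}"]
    # General case: upper half with LUI, lower half with ORI.
    return [f"lui {reg}, {hi:#x}", f"ori {reg}, {reg}, {lo:#x}"]

for line in expand_li("$t0", 0x12345678):
    print(line)
```

For `0x12345678` this yields the same LUI/ORI pair as the listing above; constants whose upper or lower half is trivial need only one instruction.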

Other architectures (notably ARM) often mix data with code in order to work with large immediates, using load instructions to get the immediate into a register, and using the instruction pointer to locate it:

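ldr    r0, [pc, #0]   @ assuming Thumb state: loads the word at Align(PC, 4) + 0, where PC reads as this instruction's address + 4
.short 0x0000         @ 2 bytes of padding so the literal below is word-aligned
.word  0x12345678     @ the 32-bit literal that ends up in r0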

And again the assembler might implement pseudo-instructions to do the same operation in a way that is clearer to understand by programmers, like in this case ldr r0, =IMMEDIATE using "unified assembler language" syntax that can encode both into Thumb and ARM instructions.
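The address arithmetic behind the PC-relative load can be sketched in Python. The sketch assumes the listing above is Thumb code placed at the made-up address `0x1000`; `literal_address` is a hypothetical helper, and it relies on the PC reading as the instruction address plus 4 in Thumb state (plus 8 in ARM state), rounded down to a multiple of 4.

```python
def literal_address(insn_addr: int, imm: int, thumb: bool = True) -> int:
    """Address accessed by `ldr rX, [pc, #imm]` located at insn_addr."""
    pc = insn_addr + (4 if thumb else 8)  # value the PC reads as
    return (pc & ~0b11) + imm             # base is the PC aligned down to 4

# Assumed layout of the listing (Thumb, starting at 0x1000):
#   0x1000: ldr r0, [pc, #0]   (2 bytes)
#   0x1002: .short 0x0000      (2 bytes of padding)
#   0x1004: .word 0x12345678
memory = {0x1004: (0x12345678).to_bytes(4, "little")}

addr = literal_address(0x1000, 0)
r0 = int.from_bytes(memory[addr], "little")
assert (addr, r0) == (0x1004, 0x12345678)
```

This also shows why the padding halfword is there: with the `ldr` at a 4-byte-aligned address, an offset of 0 points 4 bytes past the instruction, not directly behind it.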
