Is there a document for this kind of behavior?
This is not really considered processor behavior, but rather a clever yet well-known code sequence that assemblers and compilers use to compose large immediates in fewer instructions.
The only behavior of the processor involved is sign extension of the 12-bit immediate in I-type instructions such as addi and lw (and likewise in S-type stores such as sw).
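As a rough illustration of what that sign extension does to a 12-bit field, here is a minimal Python sketch. The helper name sign_extend_12 is made up for this example; it only models the arithmetic, not any particular hardware implementation.

```python
def sign_extend_12(imm12):
    """Interpret the low 12 bits of imm12 as a signed two's-complement value."""
    imm12 &= 0xFFF  # keep only the 12 encoded bits
    # Bit 11 is the sign bit: if it is set, the value is negative.
    return imm12 - 0x1000 if imm12 & 0x800 else imm12


print(sign_extend_12(0x400))                    # 1024
print(sign_extend_12(0x800))                    # -2048
print(hex(sign_extend_12(0x800) & 0xFFFFFFFF))  # 0xfffff800
```

The last line shows the same value as a 32-bit pattern: the upper 20 bits are all ones.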
The reason the designers do this is a combination of two things:
That they want to allow negative immediates for instructions like addi, as well as for lw and sw, deeming that negative offsets are sufficiently useful, as they can be used for frame-pointer-relative addressing to access local variables, or to reach a header that immediately precedes a block, among other things.
And further, they want the hardware to have only one kind of 12-bit extension, namely sign extension.
These two points, taken together, mean that lui and one of addi, lw, or sw can form full 32-bit addresses or values, all working the same way: because the second instruction sign-extends its immediate, the constant used for the lui may need to be incremented.
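The adjustment described above can be sketched in a few lines of Python. The function name split_hi_lo is hypothetical and chosen only for this illustration. It is one way to compute the two immediates, not a description of how any specific assembler implements it.

```python
def split_hi_lo(value):
    """Split a 32-bit value into (20-bit lui immediate, signed 12-bit immediate)."""
    value &= 0xFFFFFFFF
    lo = value & 0xFFF
    if lo & 0x800:
        lo -= 0x1000  # the value the hardware sees after sign extension
    # Subtracting a negative lo carries +1 into the upper 20 bits,
    # which is exactly the increment applied to the lui constant.
    hi = ((value - lo) >> 12) & 0xFFFFF
    return hi, lo


print(split_hi_lo(0x1400))  # (1, 1024)
print(split_hi_lo(0x1800))  # (2, -2048)
```

When bit 11 of the value is clear, the upper part is used unchanged; when it is set, the upper part comes out one higher.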
They didn't have to architect it this way; for example, they could have provided another instruction, addui, that clears the upper 20 bits of the immediate before adding; or they could have provided versions of lw and sw that do the same, or defined lw and sw to support only 12-bit unsigned immediates.
What they chose instead is a compromise that allows negative immediates in general while otherwise taking the simpler of the hardware alternatives.
The designers have gone to some lengths to simplify the hardware, with consideration for embedded and otherwise power- or size-limited processors.
Why 2 and -2048?
To avoid the sign extension feature of addi with negative 12-bit numbers, you would have to limit immediates to 11 bits unsigned, which leaves the 12th bit (the sign bit) as zero, so the 12-bit value is never negative and sign extension never sets the upper bits. For example, 0x400 fits in 11 bits, so with that we can do:
lui x7, 1
addi x7, x7, 0x400
addi x7, x7, 0x400
achieving 0x1000 + 0x400 + 0x400 = 0x1800.
However, as you can see, that involves three instructions!
To shorten the code sequence, we must take advantage of the extra 12th (sign) bit, even though it is going to be set, which makes the upper 20 bits of the sign-extended immediate all ones (a value of -1 in those upper 20 bits) before it is used by the addi.
That -1 (in the upper 20 bits, caused by sign extension of the 12-bit immediate) needs to be offset by +1 (in the upper 20 bits) to obtain the desired number, and that +1 offset is applied in the lui instruction: hence lui x7, 2 instead of 1, and addi x7, x7, 0x800 to complete the two-instruction sequence. 0x800 taken as a signed 12-bit number is -2048, so 2 and -2048: 0x2000 = 8192; 8192 + (-2048) = 6144; 6144 = 0x1800.
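To check the arithmetic, here is a small Python sketch that models only the register effect of lui and addi on a 32-bit register. The helper functions (sext12, lui, addi) are illustrative stand-ins written for this example, not part of any real simulator.

```python
MASK32 = 0xFFFFFFFF


def sext12(imm):
    """Sign-extend a 12-bit immediate field."""
    imm &= 0xFFF
    return imm - 0x1000 if imm & 0x800 else imm


def lui(imm20):
    """Model lui: place the 20-bit immediate in bits 31..12."""
    return (imm20 << 12) & MASK32


def addi(reg, imm12):
    """Model addi: add the sign-extended 12-bit immediate, wrapping to 32 bits."""
    return (reg + sext12(imm12)) & MASK32


# Three-instruction sequence: lui 1, then two addi of 0x400.
three = addi(addi(lui(1), 0x400), 0x400)

# Two-instruction sequence: lui 2, then addi with the bit pattern 0x800 (-2048).
two = addi(lui(2), 0x800)

# Without the +1 adjustment in lui, the result is 0x1000 too low.
naive = addi(lui(1), 0x800)

print(hex(three), hex(two), hex(naive))  # 0x1800 0x1800 0x800
```

Both the three-instruction and the adjusted two-instruction sequences reach 0x1800, while the unadjusted pair lands on 0x800.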