JVM instructions - sload

Question

I guess it is a basic question, but why is there no sload instruction? Why can you load every primitive type except short? (There is saload, but still...)

For:

public class ShortTest {
    public void test() {
        short i = 1;
        System.out.print(i);
    }
}

The compiler still uses iload_1. Is it because short is a 16-bit type and processors handle 32-bit values better (since all modern processors are 32- or 64-bit)?

Related: https://stackoverflow.com/questions/14531235/in-java-is-it-more-efficient-to-use-byte-or-short-instead-of-int-and-float-inst – Mark Rotteveel, Aug 27 '19 at 11:30
Even with a (short) cast, istore and iload are used. There is no sload instruction at all. The question is why. – Noskol, Aug 27 '19 at 11:56
Why is no optimization done? iload could be replaced by iconst_1 in this example. – Lee, Aug 27 '19 at 13:10
@Lee it does use `iconst_1`, i.e. `iconst_1, istore_1, getstatic System.out, iload_1, invokevirtual PrintStream.print(I)V`. It could eliminate the local variable, but why should it? The author has written the code with a local variable and the compiler keeps it. The runtime performance is not affected anyway. – Holger, Aug 27 '19 at 15:22
And that is what I say is not optimal: why load a constant, store it, and reload it instead of using the constant directly? OK, the code says so, but it could be optimized. – Lee, Aug 28 '19 at 06:07

score 7 · Accepted Answer · edited Jun 20 '20 at 09:12

Refer to the JVM specification, §2.11.1. Types and the Java Virtual Machine:

Note that most instructions in Table 2.11.1-A do not have forms for the integral types byte, char, and short. None have forms for the boolean type. A compiler encodes loads of literal values of types byte and short using Java Virtual Machine instructions that sign-extend those values to values of type int at compile-time or run-time. Loads of literal values of types boolean and char are encoded using instructions that zero-extend the literal to a value of type int at compile-time or run-time. Likewise, loads from arrays of values of type boolean, byte, short, and char are encoded using Java Virtual Machine instructions that sign-extend or zero-extend the values to values of type int. Thus, most operations on values of actual types boolean, byte, char, and short are correctly performed by instructions operating on values of computational type int.
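The widening described in this passage can be observed from plain Java. The following is a minimal sketch, not taken from the specification; the class name `WideningDemo` and its helper methods are made up for illustration. A byte or short array element arrives as a sign-extended int, while a char element arrives zero-extended.

```java
public class WideningDemo {
    // Reads a byte element; the value is sign-extended to int.
    static int loadByte(byte[] a, int i) { return a[i]; }

    // Reads a short element; the value is sign-extended to int.
    static int loadShort(short[] a, int i) { return a[i]; }

    // Reads a char element; the value is zero-extended to int.
    static int loadChar(char[] a, int i) { return a[i]; }

    public static void main(String[] args) {
        // All three elements have every bit set in their narrow type.
        System.out.println(loadByte(new byte[] {(byte) 0xFF}, 0));      // -1
        System.out.println(loadShort(new short[] {(short) 0xFFFF}, 0)); // -1
        System.out.println(loadChar(new char[] {(char) 0xFFFF}, 0));    // 65535
    }
}
```

Running `javap -c WideningDemo` on the compiled class should show the array loads followed directly by `ireturn`, with no separate conversion instruction.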

It’s worth recalling that in Java, any integer arithmetic not involving long will have an int result, regardless of whether the input is byte, char, short, or int.

So a line like

short i = 1, j = 2, k = i + j;

will not compile; it requires a type cast, like

short i = 1, j = 2, k = (short)(i + j);

And this type cast will be the only indicator that short is involved. Leaving debug hints aside, there is no formal declaration of local variables in bytecode, only assignments of values, which determine their type. So local variables of type short simply do not exist. The code above compiles to

     0: iconst_1
     1: istore_1
     2: iconst_2
     3: istore_2
     4: iload_1
     5: iload_2
     6: iadd
     7: i2s
     8: istore_3

which is identical to the compiled form of

int i = 1, j = 2, k = (short)(i + j);
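To see what the i2s step does at run time, here is a small sketch (the class `PromotionDemo` and its method names are hypothetical). The addition itself is carried out on int values, and only the cast narrows the result back to 16 bits.

```java
public class PromotionDemo {
    // short + short is evaluated as int, so no overflow happens here.
    static int addAsInt(short a, short b) {
        return a + b;
    }

    // The explicit cast truncates the int sum to its low 16 bits.
    static short addAsShort(short a, short b) {
        return (short) (a + b);
    }

    public static void main(String[] args) {
        short max = Short.MAX_VALUE; // 32767
        short one = 1;
        System.out.println(addAsInt(max, one));   // 32768
        System.out.println(addAsShort(max, one)); // -32768
    }
}
```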

But mind that the compile-time type of a variable can change which method the compiler chooses for an invocation when overloads exist. This is especially important if the types carry different semantics, as in the case of print(boolean) or print(char). While the value passed to the method has an int type in either case, the outcome is entirely different.
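A short sketch of that overload effect, using a hypothetical `describe` method in place of `print`: the same numeric value 65 selects a different overload depending on the compile-time type of the argument.

```java
public class OverloadDemo {
    static String describe(int v)     { return "int " + v; }
    static String describe(char v)    { return "char " + v; }
    static String describe(boolean v) { return "boolean " + v; }

    public static void main(String[] args) {
        char c = 'A';   // numeric value 65
        short s = 65;
        int i = 65;
        System.out.println(describe(c)); // char A
        // There is no describe(short); short widens to int, never to char.
        System.out.println(describe(s)); // int 65
        System.out.println(describe(i)); // int 65
    }
}
```

The overload is chosen at compile time and recorded in the method descriptor of the call, which is why the short-ness of a variable still matters even though the value is loaded with `iload`.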

Another example of differences enforced by the compiler is

{
    int i = 1;
    i++;
}
{
    short s = 1;
    s++;
}

which gets compiled to

     0: iconst_1
     1: istore_1
     2: iinc          1, 1
     5: iconst_1
     6: istore_1
     7: iload_1
     8: iconst_1
     9: iadd
    10: i2s
    11: istore_1

So, since the calculation is always performed in 32 bits, the compiler inserts the necessary code to truncate the result to short for the second increment. Note again the absence of variable declarations; as a result, the code is identical to the compiled form of

int i = 1;
i++;
i = 1;
i = (short)(i+1);
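The inserted truncation becomes visible at the edge of the short range. A minimal sketch, with made-up names `IncrementDemo`, `incInt`, and `incShort`:

```java
public class IncrementDemo {
    // i++ on an int is a plain 32-bit increment.
    static int incInt(int i) {
        i++;
        return i;
    }

    // s++ on a short behaves like s = (short) (s + 1).
    static short incShort(short s) {
        s++;
        return s;
    }

    public static void main(String[] args) {
        System.out.println(incInt(Short.MAX_VALUE));   // 32768
        System.out.println(incShort(Short.MAX_VALUE)); // -32768
    }
}
```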

It’s also worth looking at the Verification Type System, as the verifier will check the validity of all transfers from and to local variables:

The type checker enforces a type system based upon a hierarchy of verification types, illustrated below.

Verification type hierarchy:

                             top
                 ____________/\____________
                /                          \
               /                            \
            oneWord                       twoWord
           /   |   \                     /       \
          /    |    \                   /         \
        int  float  reference        long        double
                     /     \
                    /       \_____________
                   /                      \
                  /                        \
           uninitialized                    +------------------+
            /         \                     |  Java reference  |
           /           \                    |  type hierarchy  |
uninitializedThis  uninitialized(Offset)    +------------------+  
                                                     |
                                                     |
                                                    null

So the type system is simplified compared to the Java language types, and the verifier doesn’t mind if, for example, you pass a boolean value to a method expecting a char, as both are int types.

Maurice Perry · Answer 2 · 2019-08-27T11:54:16.913

1

Because every local variable occupies at least a 32-bit slot. The same goes for bytes.

edited Aug 27 '19 at 11:54

answered Aug 27 '19 at 11:31


Can you please provide any documentation with that information? I didn't know that there is padding on the stack. – Noskol Aug 27 '19 at 11:53
@Noskol you can start with this: https://docs.oracle.com/javase/specs/jvms/se7/html/ – Maurice Perry Aug 27 '19 at 11:56
If someone is curious: https://docs.oracle.com/javase/specs/jvms/se7/html/jvms-2.html#jvms-2.6.1 "A single local variable can hold a value of type boolean, byte, char, short, int, float, reference, or returnAddress. A pair of local variables can hold a value of type long or double." This implies that a local variable takes 32 bits. – Noskol Aug 27 '19 at 12:06
@Noskol ...or 64 bits – Maurice Perry Aug 27 '19 at 12:39
This sort of makes sense from a recursive perspective. I would imagine there could be undesirable overhead to adjust the stack frame to accommodate all primitive types. – WJS Aug 27 '19 at 14:40
Well, that’s not a real explanation. After all, `float` and reference types are also treated like 32 bit values (though the actual implementation may have 64 bit pointers), but there are still dedicated instructions for processing `float` or reference values. The decision to handle all integer values besides `long` the same way, likely has the same motivation as the distinction between type1 and type2 types, but still is different. Regardless of how local variables are actually defined, the bytecode would also work if all `load` and `store` instructions were untyped, as the type is inferable. – Holger Aug 27 '19 at 16:20
@Holger `float`s have 32 bits and use a single slot. `long`s and `double`s have 64 bits and use two. References use a single slot whether they are 32 or 64 bits. The model is 32-bit-based, but implementations may use 64-bit references. – Maurice Perry Aug 29 '19 at 05:18
@MauricePerry I know that. I even said that. Modelling variable types as type 1 and type 2 does not determine, how they are actually processed, not even how they are actually stored. Not only references may be actually 64 bit, using 64 bit general purpose registers may in fact extend all values to 64 bit. Since this processing is independent from the model, the type1/type2 distinction in the model is actually unnecessary and, fun fact, is not made in the newer `StackMapTable` attribute. So all that doesn’t explain why `boolean`, `byte`, `short`, and `char` are treated like `int`. – Holger Aug 29 '19 at 07:30
@Holger the number of bits is irrelevant. It's the number of slots that counts, and every local variable occupies at least one slot, as does everything pushed on the stack. – Maurice Perry Aug 29 '19 at 07:48
Sure. And to push one type 1 value from a local variable slot to a stack slot, you wouldn’t need to distinguish between `iload`, `aload`, or `fload`, as all do exactly the same. Compare with `dup`, which duplicates a type 1 value on the stack, no matter which. Still, the JVM designers decided to distinguish between `I`, `A`, and `F` for loads and stores, whereas `Z`, `B`, `C`, and `S` are treated as `I`. Whether you call it “one slot” or “a 32 bit value”, neither would explain why they made that decision. – Holger Aug 29 '19 at 09:25
@Holger of course it does: Z, B, C, and S are the only types that take less than 32 bits. – Maurice Perry Aug 29 '19 at 09:41
[Who said “the number of bits is irrelevant”](https://stackoverflow.com/questions/57673666/jvm-instructions-sload/57673761?noredirect=1#comment101853369_57673761)? Anyway, since you are ignoring almost everything I wrote, but just repeat yourself, I’ll stop here. – Holger Aug 29 '19 at 09:52
@Holger less than a slot if you prefer. – Maurice Perry Aug 29 '19 at 11:14
