I'm particularly concerned with using enums in switch statements. I'll be using a program to count the number of occurrences of a finite set of symbols in a very long array.
First define some constants,
static final int NUMSYMBOLS = Integer.MAX_VALUE/100; // size of array
// Constants for symbols ZERO ... NINE
static final int ZERO_I =0, ONE_I =1, TWO_I =2, THREE_I =3, FOUR_I =4;
static final int FIVE_I =5, SIX_I =6, SEVEN_I =7, EIGHT_I =8, NINE_I =9;
and a corresponding enum.
enum Symbol {
    ZERO (0), ONE (1), TWO (2), THREE (3), FOUR (4),
    FIVE (5), SIX (6), SEVEN (7), EIGHT (8), NINE (9);
    final int code;
    Symbol(int num) {
        code = num;
    }
    public final int getCode() {
        return code;
    }
}
The enum has a field code set by a constructor. We will use this code in our testing
later, which can yield some speed-up.
The symbols are stored in an array, with their codes in a corresponding int array.
Symbol[] symbolArray;
int[] intArray;
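How the two arrays are filled is not shown here. A minimal sketch, assuming both are populated from the same pseudo-random sequence so that every test counts identical data; the fill helper, the seed, and the small array size are illustrative assumptions, not the original benchmark setup.

```java
import java.util.Random;

class SymbolArrays {
    enum Symbol {
        ZERO(0), ONE(1), TWO(2), THREE(3), FOUR(4),
        FIVE(5), SIX(6), SEVEN(7), EIGHT(8), NINE(9);
        final int code;
        Symbol(int num) { code = num; }
        public final int getCode() { return code; }
    }

    Symbol[] symbolArray;
    int[] intArray;

    // Hypothetical helper: fills both arrays with the same pseudo-random sequence.
    void fill(int size, long seed) {
        Symbol[] values = Symbol.values(); // cached, since values() copies its array on each call
        Random random = new Random(seed);
        symbolArray = new Symbol[size];
        intArray = new int[size];
        for (int i = 0; i < size; ++i) {
            Symbol sym = values[random.nextInt(values.length)];
            symbolArray[i] = sym;
            intArray[i] = sym.getCode();
        }
    }

    public static void main(String[] args) {
        SymbolArrays arrays = new SymbolArrays();
        arrays.fill(1000, 42L); // size and seed are arbitrary
        System.out.println(arrays.symbolArray[0] + " " + arrays.intArray[0]);
    }
}
```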
The symbols are counted in a method.
void testEnum() {
    for(int i=0;i<NUMSYMBOLS;++i) {
        Symbol sym = symbolArray[i];
        switch(sym) {
            case ZERO: ++numZero; break;
            case ONE: ++numOne; break;
            case TWO: ++numTwo; break;
            case THREE: ++numThree; break;
            case FOUR: ++numFour; break;
            case FIVE: ++numFive; break;
            case SIX: ++numSix; break;
            case SEVEN: ++numSeven; break;
            case EIGHT: ++numEight; break;
            case NINE: ++numNine; break;
            default: break;
        }
    }
}
and a similar method for integers.
void testInteger() {
    for(int i=0;i<NUMSYMBOLS;++i) {
        int num = intArray[i];
        switch(num) {
            case ZERO_I: ++numZero; break;
            case ONE_I: ++numOne; break;
            case TWO_I: ++numTwo; break;
            case THREE_I: ++numThree; break;
            case FOUR_I: ++numFour; break;
            case FIVE_I: ++numFive; break;
            case SIX_I: ++numSix; break;
            case SEVEN_I: ++numSeven; break;
            case EIGHT_I: ++numEight; break;
            case NINE_I: ++numNine; break;
            default:
                break;
        }
    }
}
We can use the code from the Enum to make the switch a little more efficient.
void testEnumCode() {
    for(int i=0;i<NUMSYMBOLS;++i) {
        Symbol sym = symbolArray[i];
        switch(sym.getCode()) { // Uses the code here
            case ZERO_I: ++numZero; break;
            case ONE_I: ++numOne; break;
            case TWO_I: ++numTwo; break;
            case THREE_I: ++numThree; break;
            case FOUR_I: ++numFour; break;
            case FIVE_I: ++numFive; break;
            case SIX_I: ++numSix; break;
            case SEVEN_I: ++numSeven; break;
            case EIGHT_I: ++numEight; break;
            case NINE_I: ++numNine; break;
            default:
                break;
        }
    }
}
Running each of the three methods 10 times gives the following timings.
Totals enum 2,548,251,200ns code 2,330,238,900ns int 2,043,553,600ns
Percentages enum 100% code 91% int 80%
This shows a noticeable time improvement from using integers. Using the code field gives a timing roughly half-way between enums and ints.
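The timing harness itself is not shown. A rough sketch of how such totals and percentages could be collected with System.nanoTime(); the timeRuns and percentOf helpers are hypothetical names, and the summing loop in main is only a stand-in workload. Hand-rolled timings like this are sensitive to JIT warm-up, so a harness such as JMH is likely to give more trustworthy numbers.

```java
class TimingSketch {
    // Hypothetical helper: runs the task `runs` times and returns the total elapsed nanoseconds.
    static long timeRuns(Runnable task, int runs) {
        long total = 0;
        for (int r = 0; r < runs; ++r) {
            long start = System.nanoTime();
            task.run();
            total += System.nanoTime() - start;
        }
        return total;
    }

    // Expresses a total as a rounded percentage of a baseline total.
    static long percentOf(long value, long baseline) {
        return Math.round(100.0 * value / baseline);
    }

    public static void main(String[] args) {
        int[] data = new int[1_000_000];
        long[] sink = new long[1]; // stores the result so the loop is not trivially dead code
        long total = timeRuns(() -> {
            long sum = 0;
            for (int v : data) sum += v;
            sink[0] = sum;
        }, 10);
        System.out.printf("Total %,dns%n", total);
    }
}
```

Applied to the totals above, percentOf reproduces the 91% and 80% figures relative to the enum timing.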
These differences in timing can easily be swamped by the surrounding code. For instance, using an ArrayList rather than an array makes the timing difference vanish
completely.
Another option is to use the Enum.ordinal()
method. This has performance similar to using getCode(). The pros and cons of depending on this method are discussed at Is it good practice to use ordinal of enum?.
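A minimal sketch of the ordinal() variant, cut down to three symbols. It assumes the int constants mirror the enum's declaration order, which is exactly the fragile dependency that makes ordinal() contentious; the class name and the count method are illustrative only.

```java
class OrdinalSwitch {
    enum Symbol { ZERO, ONE, TWO }

    // Assumed to match declaration order; reordering the enum would silently break the switch.
    static final int ZERO_I = 0, ONE_I = 1, TWO_I = 2;

    int numZero, numOne, numTwo;

    void count(Symbol[] symbols) {
        for (Symbol sym : symbols) {
            switch (sym.ordinal()) { // int switch, no separate code field needed
                case ZERO_I: ++numZero; break;
                case ONE_I: ++numOne; break;
                case TWO_I: ++numTwo; break;
                default: break;
            }
        }
    }

    public static void main(String[] args) {
        OrdinalSwitch counter = new OrdinalSwitch();
        counter.count(new Symbol[] { Symbol.ZERO, Symbol.TWO, Symbol.TWO });
        System.out.println(counter.numZero + " " + counter.numOne + " " + counter.numTwo);
    }
}
```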
In my application, a reverse Polish calculator, this loop and switch statement are the heart of the program. They run millions of times and show up in performance analysis.
There, enums are used for opcodes (PUSH, POP, etc.), and each command consists of an opcode with additional arguments.
enum OpCode {
    PUSH(0), POP(1), ...;
    private final int code;
    OpCode(int n) { code=n; }
    public int getCode() { return code; }
}
class Command {
    OpCode op;
    int code;
    String var;
    Command (OpCode op,String name) {
        this.op = op;
        this.code = op.getCode();
        this.var = name;
    }
}
Building the list of commands can use the enum, without needing to know about the actual int values.
Command com = new Command(OpCode.PUSH,"x");
For non-critical parts of the code we can use the enum in a switch, say in the toString() method of the Command.
public String toString() {
    switch(op) {
        case PUSH:
            return "Push "+var;
        ....
    }
}
But critical parts can use the code.
public void evaluate(Command com) {
    switch(com.code) {
        case 0:
            stack.push(com.var);
            break;
        ....
    }
}
for that extra bit of performance.
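Putting the pieces together, here is a small self-contained sketch of the pattern with only two opcodes. The PUSH_I and POP_I constants, the String stack, and the POP behaviour are assumptions made for illustration, not the real calculator. Defining the enum codes in terms of the int constants keeps the case labels and the enum in step without magic numbers.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class OpCodeSketch {
    // Hypothetical int constants: the single source of truth for the opcode values.
    static final int PUSH_I = 0, POP_I = 1;

    enum OpCode {
        PUSH(PUSH_I), POP(POP_I);
        private final int code;
        OpCode(int n) { code = n; }
        public int getCode() { return code; }
    }

    static class Command {
        final OpCode op;
        final int code;   // cached copy of op.getCode() for the hot path
        final String var;
        Command(OpCode op, String name) {
            this.op = op;
            this.code = op.getCode();
            this.var = name;
        }
    }

    final Deque<String> stack = new ArrayDeque<>();

    // Hot path: switches on the cached int rather than on the enum.
    void evaluate(Command com) {
        switch (com.code) {
            case PUSH_I: stack.push(com.var); break;
            case POP_I: stack.pop(); break;
            default: throw new IllegalStateException("Unknown opcode " + com.code);
        }
    }

    public static void main(String[] args) {
        OpCodeSketch calc = new OpCodeSketch();
        calc.evaluate(new Command(OpCode.PUSH, "x"));
        calc.evaluate(new Command(OpCode.PUSH, "y"));
        calc.evaluate(new Command(OpCode.POP, null));
        System.out.println(calc.stack); // prints [x]
    }
}
```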
The bytecode of the switch statements is interesting. In the int example the switch statement compiles to:
private void testInteger(int);
Code:
0: iload_1
1: tableswitch { // 0 to 9
0: 56
1: 69
2: 82
3: 95
4: 108
5: 121
6: 134
7: 147
8: 160
9: 173
default: 186
}
56: aload_0
57: dup
58: getfield #151 // Field numZero:I
61: iconst_1
62: iadd
63: putfield #151 // Field numZero:I
66: goto 186
69: aload_0
70: dup
71: getfield #153 // Field numOne:I
74: iconst_1
75: iadd
76: putfield #153 // Field numOne:I
79: goto 186
....
The tableswitch instruction efficiently jumps forward in the code depending on the value.
The bytecode for the switch using the code (or ordinal) is similar, just with an extra call to the getCode() method.
private void testCode(toys.EnumTest$Symbol);
Code:
0: aload_1
1: invokevirtual #186 // Method toys/EnumTest$Symbol.getCode:()I
4: tableswitch { // 0 to 9
0: 60
1: 73
2: 86
3: 99
4: 112
5: 125
6: 138
7: 151
8: 164
9: 177
default: 190
....
Using just the enum, the bytecode is more complex.
private void testEnum(toys.EnumTest$Symbol);
Code:
0: invokestatic #176 // Method $SWITCH_TABLE$toys$EnumTest$Symbol:()[I
3: aload_1
4: invokevirtual #179 // Method toys/EnumTest$Symbol.ordinal:()I
7: iaload
8: tableswitch { // 1 to 10
1: 64
2: 77
3: 90
4: 103
5: 116
6: 129
7: 142
8: 155
9: 168
10: 181
default: 194
}
Here there is first a call to a new method, $SWITCH_TABLE$toys$EnumTest$Symbol:().
This method creates an array translating the ordinal values to an index used in the switch. Basically it's equivalent to
int[] lookups = get_SWITCH_TABLE();
int pos = lookups[sym.ordinal()];
switch(pos) {
    ...
}
The switch table creation method calculates the table once on its first call and returns the same table on each subsequent call. So we see two quite trivial function calls and one extra array lookup when compared to the integer case.
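The lookup described above can be imitated by hand. This is an illustrative sketch of the idea with three symbols, not the compiler's actual output; switchTable() and the table field are made-up names, and the sketch assumes single-threaded use. Entries start at 1, as in the bytecode above, so an unset entry of 0 falls through to default.

```java
class SwitchTableSketch {
    enum Symbol { ZERO, ONE, TWO }

    private static int[] table; // built on first use, then reused

    // Hypothetical stand-in for the synthetic $SWITCH_TABLE$ method.
    static int[] switchTable() {
        int[] t = table;
        if (t != null) {
            return t; // later calls return the cached table
        }
        t = new int[Symbol.values().length];
        t[Symbol.ZERO.ordinal()] = 1;
        t[Symbol.ONE.ordinal()] = 2;
        t[Symbol.TWO.ordinal()] = 3;
        table = t;
        return t;
    }

    // Mirrors the bytecode: fetch the table, index it by ordinal, switch on the result.
    static String describe(Symbol sym) {
        switch (switchTable()[sym.ordinal()]) {
            case 1: return "zero";
            case 2: return "one";
            case 3: return "two";
            default: return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(Symbol.TWO)); // prints two
    }
}
```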