If you mean runtime performance, the following algorithm (which originally finds the highest set bit) is probably fastest. I've already modified it to return the number of bytes required to encode the integer argument:
private static final int[] DE_BRUIJN_BIT_POSITION_LUT = {
0, 9, 1, 10, 13, 21, 2, 29, 11, 14, 16, 18, 22, 25, 3, 30,
8, 12, 20, 28, 15, 17, 24, 7, 19, 27, 23, 6, 26, 5, 4, 31
};
public static int nbBytes2(int n) {
n |= n >> 1;
n |= n >> 2;
n |= n >> 4;
n |= n >> 8;
n |= n >> 16;
return DE_BRUIJN_BIT_POSITION_LUT[((n * 0x07C4ACDD) >> 27) & 0x1f] / 8 + 1;
}
Even if it looks more complicated, it does not have any loops or conditional processing, which allows optimal use of modern CPU pipelines.
Comparing the De Bruijn algorithm to your method, your method is ~4 times faster for inputs in the range 0x0-0xff (your method won't branch either). For inputs in the range 0x100-0xfff, my method is 19 times faster, for inputs 0x10000-0xffffff 28 times faster and for inputs >0x1000000 35 times faster. All numbers are valid for my hardware, on other computers it may of course differ.