1

I want to know the size of byte of my interger. Example :

public static void main(String[] args) throws IOException {
   int a = 256;
   System.out.println(nbBytes(a));
}

static byte nbBytes(int value) {
   byte l = 0;
   while (value != 0) {
      value >>>= 8;
      ++l;
   }
   return l;
}

It works perfectly, But i want to optimize this calculation. Have you a proposition ? :D

ctype.h
  • 1,470
  • 4
  • 20
  • 34
Mehdi
  • 2,160
  • 6
  • 36
  • 53
  • Well, you could use a simple "if ladder". (But note that your current scheme will fail for negative numbers.) – Hot Licks Oct 29 '12 at 16:02
  • Like @HotLicks says, this will return 4 for every negative number. Is that what you wanted? – Brian Oct 29 '12 at 16:03

3 Answers3

2

If you mean runtime performance, the following algorithm (which originally finds the highest set bit) is probably fastest. I've already modified it to return the number of bytes required to encode the integer argument:

private static final int[] DE_BRUIJN_BIT_POSITION_LUT = {
      0, 9, 1, 10, 13, 21, 2, 29, 11, 14, 16, 18, 22, 25, 3, 30,
      8, 12, 20, 28, 15, 17, 24, 7, 19, 27, 23, 6, 26, 5, 4, 31
    };

public static int nbBytes2(int n) {
    n |= n >> 1;  
    n |= n >> 2;
    n |= n >> 4;
    n |= n >> 8;
    n |= n >> 16;
    return DE_BRUIJN_BIT_POSITION_LUT[((n * 0x07C4ACDD) >> 27) & 0x1f] / 8 + 1;
}

Even if it looks more complicated, it does not have any loops or conditional processing, which allows optimal use of modern CPU pipelines.

Comparing the De Bruijn algorithm to your method, your method is ~4 times faster for inputs in the range 0x0-0xff (your method won't branch either). For inputs in the range 0x100-0xfff, my method is 19 times faster, for inputs 0x10000-0xffffff 28 times faster and for inputs >0x1000000 35 times faster. All numbers are valid for my hardware, on other computers it may of course differ.

jarnbjo
  • 33,923
  • 7
  • 70
  • 94
  • This is beautiful. [This thread](http://stackoverflow.com/questions/7365562/de-bruijn-like-sequence-for-2n-1-how-is-it-constructed) is relevant for those who are interested in how this works. – Ted Hopp Oct 29 '12 at 16:44
1

In Java, an int is always a 32-bit, signed 2's-complement value. See, for instance, Section 2.3 of the Java Virtual Machine Specification.

If you want to know the minimum number of bits to store a particular value, you can use Integer.numberOfLeadingZeros to get that:

int bitsNeeded = 32 - Integer.numberOfLeadingZeros(value);

You can then round up to get the number of bytes needed.

If you're running an older version of Java that doesn't include this function, here's the 1.6 source for it:

public static int numberOfLeadingZeros(int i) {
    if (i == 0)
        return 32;
    int n = 1;
    if (i >>> 16 == 0) { n += 16; i <<= 16; }
    if (i >>> 24 == 0) { n +=  8; i <<=  8; }
    if (i >>> 28 == 0) { n +=  4; i <<=  4; }
    if (i >>> 30 == 0) { n +=  2; i <<=  2; }
    n -= i >>> 31;
    return n;
}

Whether this is more efficient than what you are already doing can only be determined, I think, by profiling. It will depend, as well, on the distribution of values that you expect to encounter.

If you only expect to deal with non-negative values, I would do this:

static byte nBytes(int value) {
    if (value < (1 << 8)) return 1;
    if (value < (1 << 16)) return 2;
    if (value < (1 << 24)) return 3;
    return 4;
}

This assumes that you require 1 byte to represent zero. To handle negative numbers, there are two logical choices:

  1. Always return 4
  2. Return the minimum number of bytes necessary to represent the value in 2's complement.

For the second case, I would do the following:

static byte nBytes(int value) {
    if (value < 0) {
        if (value > Integer.MIN_VALUE) {
            value = -value;
            if (value < (1 << 7)) return 1;
            if (value < (1 << 15)) return 2;
            if (value < (1 << 23)) return 3;
        }
    } else {
        if (value < (1 << 8)) return 1;
        if (value < (1 << 16)) return 2;
        if (value < (1 << 24)) return 3;
    }
    return 4;
}
Ted Hopp
  • 232,168
  • 48
  • 399
  • 521
  • I think he wants to see how many bytes the particular value occupies. – Hot Licks Oct 29 '12 at 16:03
  • 2
    Basically he wants to know the minimum number of bytes he needs to represent the value. For example, 30 only needs 1 byte, and 350 needs 2 bytes. All negative numbers require all 4 bytes since you always need the MSB. – Brian Oct 29 '12 at 16:03
  • 1
    @Brian - How you treat a negative number depends on whether you consider the "byte" to be signed or not. Of course, this affects the algorithm for positive numbers as well. – Hot Licks Oct 29 '12 at 16:17
  • Unfortunately the Java source I have at hand is pre-1.5 and doesn't include the function, but I'd bet that `numberOfLeadingZeros` is no more efficiently implemented than the OP's algorithm. There are only so many ways to skin a cat. – Hot Licks Oct 29 '12 at 16:22
  • @HotLicks - Updated my answer with source for `numberOfLeadingZeros` plus my own suggestion for how to code OP's `nBytes` method. – Ted Hopp Oct 29 '12 at 16:42
  • Probably, if negative is treated as n-byte 2's complement, positive should be treated the same way, making the positive cutoffs at 127, 32767, and 8388607, and the negative cutoffs at -128, -32768, and -8388608. – Hot Licks Oct 29 '12 at 18:42
0

I don't know it this is more optimum, but another solution is something like (not tested):

return (byte)Math.ceil(Integer.toBinaryString(value).length()/8.0);
magodiez
  • 741
  • 2
  • 10
  • 23