
I am trying to convert a 4-byte array to an integer. The code I am using is this:

    byte[] bytes = new byte[4];
    bytes[0] = (byte)0x95;
    bytes[1] = (byte)0x19;
    bytes[2] = (byte)0x07;
    bytes[3] = (byte)0x00;
    
    int number = bytes[0] + (bytes[1] << 8) + (bytes[2] << 16) + (bytes[3] << 24);
    

When I run this exact code on my machine, I expect the result 0x00071995, but I get 0x00071895 instead. How? Why?

java 19.0.1 2022-10-18 Java(TM) SE Runtime Environment (build 19.0.1+10-21)

  • If I cast every `byte[]` element to `long` and do a logical AND with 0xFF, I get the expected result. But I am still wondering why I get the bad result in the first place. Unsigned and signed addition should not give any differences in binary. – alin tudose Jan 26 '23 at 20:41
  • A nicer way to do what you're trying to do is with the [`ByteBuffer`](https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/nio/ByteBuffer.html) class. Call the `wrap` method followed by the `getInt` method. – Dawood ibn Kareem Jan 26 '23 at 21:21

1 Answer


A Java `byte` is signed, and 0x95 represents the negative value -107 (since the leading bit is set). When it is promoted to `int` as part of the arithmetic operation, it is sign-extended to 0xffffff95 (i.e. still -107), which is 256 less than your desired value of 0x00000095. That missing 256 is what causes the discrepancy in the next byte up.
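As a minimal sketch of that promotion, the snippet below reruns the expression from the question and prints the intermediate value. The class name `SignExtensionDemo` and the helper `naiveSum` are made up for illustration.

```java
class SignExtensionDemo {
    // Same expression as in the question: each byte is promoted to int
    // (with sign extension) before the shift and the addition.
    static int naiveSum(byte[] bytes) {
        return bytes[0] + (bytes[1] << 8) + (bytes[2] << 16) + (bytes[3] << 24);
    }

    public static void main(String[] args) {
        byte[] bytes = {(byte) 0x95, (byte) 0x19, (byte) 0x07, (byte) 0x00};

        int promoted = bytes[0]; // implicit widening, sign-extended
        System.out.println(promoted);                              // -107
        System.out.println(Integer.toHexString(promoted));         // ffffff95
        System.out.println(Integer.toHexString(naiveSum(bytes)));  // 71895
    }
}
```

Adding 0xffffff95 instead of 0x00000095 subtracts 256 (0x100) from the total, which is exactly the difference between 0x00071995 and 0x00071895.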

> Unsigned and signed addition should not give any differences in binary

There is no difference as long as you stay within the same width of integer type. The quantity 0x95 is equivalent to the quantity -0x6B modulo 256. However, the two are not equivalent modulo 4294967296 (2^32). As a 32-bit int, 0x95 is simply 149; it does not overflow the signed range, so it never wraps around to -0x6B.

Another way of looking at this: signed and unsigned addition themselves produce the same binary representation. Signed and unsigned promotion, however, do differ. A signed value is sign-extended (i.e. padded with 1s on the left when the sign bit is 1), whereas an unsigned value is zero-extended. A signed 8-bit 0x95 sign-extends to 0xffffff95, and an unsigned 8-bit 0x95 zero-extends to 0x00000095.
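To make the two kinds of extension concrete, here is a small sketch. The class and method names are hypothetical; `Byte.toUnsignedInt` is the standard library method (Java 8+) that performs the zero extension.

```java
class ExtensionDemo {
    // Plain widening conversion: Java sign-extends a byte to int.
    static int signExtend(byte b) {
        return b;
    }

    // Zero extension: treat the 8 bits as an unsigned value in 0..255.
    static int zeroExtend(byte b) {
        return Byte.toUnsignedInt(b); // same result as (b & 0xFF)
    }

    public static void main(String[] args) {
        byte b = (byte) 0x95;
        System.out.println(Integer.toHexString(signExtend(b))); // ffffff95
        System.out.println(Integer.toHexString(zeroExtend(b))); // 95
        System.out.println(zeroExtend(b) - signExtend(b));      // 256
    }
}
```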

> If I cast every byte[] element to long and I do a logical and with 0xFF I get the expected result

This ensures that you zero-extend instead of sign-extending; the undesired leading sign bits are forced to zero by the logical AND.
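A minimal sketch of the masked conversion, assuming a little-endian layout as in the question. The class name `LittleEndianInt` and the method `toInt` are made up for illustration. The cast to `long` is not strictly needed for a 4-byte value: masking with the `int` literal `0xFF` already promotes each byte to `int` and clears the upper 24 bits.

```java
class LittleEndianInt {
    // Mask each byte with 0xFF so it is zero-extended before shifting.
    static int toInt(byte[] bytes) {
        return (bytes[0] & 0xFF)
             | ((bytes[1] & 0xFF) << 8)
             | ((bytes[2] & 0xFF) << 16)
             | ((bytes[3] & 0xFF) << 24);
    }

    public static void main(String[] args) {
        byte[] bytes = {(byte) 0x95, (byte) 0x19, (byte) 0x07, (byte) 0x00};
        System.out.println(Integer.toHexString(toInt(bytes))); // 71995
    }
}
```

Once every term is masked, the bit ranges no longer overlap, so `|` and `+` give the same result here.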

nanofarad
  • Why does `ByteBuffer.wrap(bytes).order(LITTLE_ENDIAN).getInt()` give `0x00071995`, as apparently expected? – Slaw Jan 26 '23 at 20:49
  • @Slaw Wrapping bytes and reinterpreting/reading them doesn't involve a sign-extension *or* zero extension (and the sign-extension is the crux of the issue here). It just sticks the bytes one after the other and reads them. The sign-extension introduced an undesired 0xffffff00 (i.e. -256) which then affected the second byte from the right. – nanofarad Jan 26 '23 at 20:51
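For reference, the `ByteBuffer` approach mentioned in the comments can be sketched as follows. The class and method names are hypothetical. Note that a `ByteBuffer` defaults to big-endian order, so the explicit `order(ByteOrder.LITTLE_ENDIAN)` call is needed to match the byte layout in the question.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class ByteBufferDemo {
    // Reads the first four bytes as a little-endian int; no per-byte
    // promotion happens, so no sign extension is involved.
    static int readLittleEndianInt(byte[] bytes) {
        return ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getInt();
    }

    public static void main(String[] args) {
        byte[] bytes = {(byte) 0x95, (byte) 0x19, (byte) 0x07, (byte) 0x00};
        System.out.println(Integer.toHexString(readLittleEndianInt(bytes))); // 71995
    }
}
```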