
I'm working on a Java-to-C# sample app translation that involves cryptography (AES, RSA, and so on...)

At some point in the Java code (the one that actually works and is being translated to C#), I found this piece of code:

for (i = i; i < size; i++) {
    encodedArr[j] = (byte) (data[i] & 0x00FF);
    j++;
} // where data variable is a char[] and encodedArr is a byte[]

After some googling (here), I've seen that this is common behaviour, mainly in Java code...

I know that char is a 16-bit type and byte is only 8 bits, but I couldn't understand the reason for this bitwise AND operation in a char -> byte conversion.

Could someone explain?

Thank you in advance.

cezarlamann
  • It's probably to circumvent a possible overflow if there is some value in the upper byte of the char. Otherwise, just casting it would result in an overflow for a byte, since it can only hold 255 (0xFF) as a maximum value. – Ron Beyer May 28 '15 at 19:14

3 Answers


In this case, it is quite unnecessary, but it is the sort of thing people put in "just in case". According to the JLS, 5.1.3. Narrowing Primitive Conversion:

A narrowing conversion of a char to an integral type T likewise simply discards all but the n lowest order bits, where n is the number of bits used to represent type T. In addition to a possible loss of information about the magnitude of the numeric value, this may cause the resulting value to be a negative number, even though chars represent 16-bit unsigned integer values.

Similar code is often needed in widening conversions to suppress sign extension.
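A minimal sketch of both cases (the sample values are arbitrary, chosen only for illustration; paste the fragment into any main method to run it): in the char -> byte narrowing, the mask changes nothing, while in a byte -> int widening the mask is exactly what suppresses sign extension.

char c = '\u1234';                    // high byte is 0x12, low byte is 0x34

// Narrowing char -> byte: the cast alone already discards the high 8 bits,
// so masking first makes no difference.
byte withMask = (byte) (c & 0x00FF);  // 0x34
byte withoutMask = (byte) c;          // 0x34 as well
System.out.println(withMask == withoutMask);   // true

// Widening byte -> int: here the mask matters, because the byte is sign-extended.
byte b = (byte) 0xB4;
int signExtended = b;                 // -76 (0xFFFFFFB4)
int masked = b & 0xFF;                // 180 (0x000000B4)
System.out.println(signExtended + " vs " + masked);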

Patricia Shanahan
  • It works in Java without the mask/cast, but not in C#. C# is a little more strict about narrowing conversions. – Ron Beyer May 28 '15 at 19:23
  • @RonBeyer Maybe the code was written by someone who was thinking a bit C# although it is stated to be Java. – Patricia Shanahan May 28 '15 at 19:25
  • The code is Java but he's converting to C#, so it's just an important thing to note since it can't be removed on the C# side. – Ron Beyer May 28 '15 at 19:25
  • @RonBeyer what's the story with C# here? In the default `unchecked` context, they use the same rule (to quote, "If the source type is larger than the destination type, then the source value is truncated by discarding its “extra” most significant bits") – harold May 28 '15 at 19:55
  • @harold yes, when you use `unchecked`, but a 1:1 conversion of the code doesn't have that. You would have to use `encodedArr[j] = unchecked((byte)(data[i]));` Edit: checking on this now, maybe I'm wrong. – Ron Beyer May 28 '15 at 20:04
  • @RonBeyer but it's the default.. all you'd have to do is not check that checkbox in the build settings that sets it to checked – harold May 28 '15 at 20:07
  • @harold Yes, you seem to be right, I hadn't noticed that it was set to use unchecked by default, I always specified it when requiring an unchecked operation. – Ron Beyer May 28 '15 at 20:08

When you convert 0x00FF to binary, it becomes 0000 0000 1111 1111.

When you AND a bit with 1, it keeps its value:

1 & 1 = 1, 0 & 1 = 0

When you AND a bit with 0, the result is 0:

1 & 0 = 0, 0 & 0 = 0

When the operation encodedArr[j] = (byte) (data[i] & 0x00FF); occurs, it takes the last 8 bits of the data, and only the last 8 bits, and stores them. It throws away the first 8 bits and keeps the last 8.

The reason this is needed is that a byte is defined as an eight-bit value. The bitwise AND exists to stop a potential overflow, i.e. trying to assign more than 8 bits into a byte.

A char in Java is 2 bytes! This logic is there to stop an overflow. However, as someone pointed out below, this is pointless because the cast does it for you. Perhaps someone was being cautious?
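To see this in code, here is a minimal sketch (the character is arbitrary; drop it into any main method):

char c = '\u4142';                    // bits: 0100 0001 0100 0010
int masked = c & 0x00FF;              // keeps only the low 8 bits: 0100 0010 (0x42)
byte viaMask = (byte) masked;
byte viaCast = (byte) c;              // the cast alone drops the high byte too
System.out.println(Integer.toBinaryString(c));       // 100000101000010
System.out.println(Integer.toBinaryString(masked));  // 1000010
System.out.println(viaMask == viaCast);              // true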

Dudemanword
  • Wouldn't you want to split the char into 2 bytes to preserve the information? And I think you meant bits. – RadioSpace May 28 '15 at 19:19
  • Absolutely. I did mean bits – Dudemanword May 28 '15 at 19:20
  • It's not really "throwing them away"; the bytes are still there. It's just making the type small enough that a cast to another type won't result in an overflow. At least that's what the & is doing; the cast throws them away, if you want to clarify. – Ron Beyer May 28 '15 at 19:20
  • The bitwise and is zeroing the first 8 bits, thus "Throwing them away" – Dudemanword May 28 '15 at 19:25
  • It's completely pointless in Java, though. The high byte is already discarded by the cast. – Radiodef May 28 '15 at 19:40

It's a way to truncate the value by keeping only the least significant bits, so it "fits" in a byte!
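For instance (a minimal sketch with an arbitrary character, runnable inside any main method):

char c = '\u0142';                    // 'ł', value 322, too big for a byte
byte b = (byte) (c & 0x00FF);         // only the low 8 bits (0x42 = 66) survive
System.out.println((int) c);          // 322
System.out.println(b);                // 66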

I hope it helps!

tooomg