TL;DR: Don't use toBinaryString(). See the solution at the end.
Your problem is that Integer.toBinaryString() doesn't return leading zeroes, e.g.
System.out.println(Integer.toBinaryString(1)); // prints: 1
System.out.println(Integer.toBinaryString(10)); // prints: 1010
System.out.println(Integer.toBinaryString(100)); // prints: 1100100
For your purpose, you want to always get 8 bits for each byte.
You also need to handle negative values, because a negative byte is sign-extended when it is widened to a 32-bit int, e.g.
System.out.println(Integer.toBinaryString((byte)129)); // prints: 11111111111111111111111110000001
The easiest way to accomplish both is like this:
Integer.toBinaryString((b & 0xFF) | 0x100).substring(1)
First, it coerces the byte b to int, then retains only the lower 8 bits, and finally sets the 9th bit, e.g. 129 (decimal) becomes 1 1000 0001 (binary, spaces added for clarity). The substring(1) call then drops that 9th bit, in effect ensuring that leading zeroes are in place.
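To make the three steps visible, here is a minimal sketch that splits the expression into named intermediate values. The class name MaskStepsDemo and the method name eightBits are made up for this illustration; the expression itself is the one shown above.

```java
final class MaskStepsDemo {
    // Hypothetical helper: the same expression as above, one step per line.
    static String eightBits(byte b) {
        int unsigned = b & 0xFF;            // widen to int, keep only the low 8 bits (0..255)
        int withMarker = unsigned | 0x100;  // set bit 8, so the result always has 9 binary digits
        return Integer.toBinaryString(withMarker).substring(1); // drop the marker digit
    }

    public static void main(String[] args) {
        byte b = (byte) 129;
        System.out.println(b);                                          // prints: -127
        System.out.println(b & 0xFF);                                   // prints: 129
        System.out.println(Integer.toBinaryString((b & 0xFF) | 0x100)); // prints: 110000001
        System.out.println(eightBits(b));                               // prints: 10000001
        System.out.println(eightBits((byte) 1));                        // prints: 00000001
    }
}
```

The marker bit is what guarantees a fixed width: without it, toBinaryString() would return "1" for the last call instead of "00000001".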
It's better to have that as a helper method:
private static String toBinary(byte b) {
    return Integer.toBinaryString((b & 0xFF) | 0x100).substring(1);
}
In which case your code becomes:
StringBuilder binaryStr = new StringBuilder();
for (byte b : str.getBytes("UTF-8"))
    binaryStr.append(toBinary(b));
String result = binaryStr.toString();
E.g. if str = "Hello World", you get:
0100100001100101011011000110110001101111001000000101011101101111011100100110110001100100
You could of course just do it yourself, without resorting to toBinaryString():
StringBuilder binaryStr = new StringBuilder();
for (byte b : str.getBytes("UTF-8"))
    for (int i = 7; i >= 0; i--)
        binaryStr.append((b >> i) & 1);
String result = binaryStr.toString();
That will probably run faster too, since it avoids creating two temporary strings for every byte.
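If you want to convince yourself that the two variants produce the same output, a self-contained sketch along these lines should do. The class and method names are invented for the example, and it uses StandardCharsets.UTF_8 instead of the "UTF-8" string only to avoid having to handle the checked UnsupportedEncodingException; that substitution is my assumption, not part of the snippets above.

```java
import java.nio.charset.StandardCharsets;

final class StringToBitsDemo {
    // Variant 1: toBinaryString() with the 0x100 marker bit.
    static String viaToBinaryString(String str) {
        StringBuilder sb = new StringBuilder();
        for (byte b : str.getBytes(StandardCharsets.UTF_8)) {
            sb.append(Integer.toBinaryString((b & 0xFF) | 0x100).substring(1));
        }
        return sb.toString();
    }

    // Variant 2: shift and mask each bit, most significant bit first.
    static String viaShifts(String str) {
        byte[] bytes = str.getBytes(StandardCharsets.UTF_8);
        StringBuilder sb = new StringBuilder(bytes.length * 8);
        for (byte b : bytes) {
            for (int i = 7; i >= 0; i--) {
                sb.append((b >> i) & 1); // appends the digit 0 or 1
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "\u00e9" encodes to two bytes in UTF-8, both negative as Java bytes.
        String[] samples = { "Hello World", "\u00e9", "" };
        for (String s : samples) {
            String a = viaToBinaryString(s);
            String b = viaShifts(s);
            System.out.println(b + " (variants match: " + a.equals(b) + ")");
        }
    }
}
```

The non-ASCII sample is there on purpose: its UTF-8 bytes have the high bit set, which is exactly the case where the unmasked toBinaryString() call goes wrong.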