18

How can I convert an array of bytes to a String without any conversion?

I tried:

  String doc=new String( bytes);

But doc is not the same as the bytes (the bytes are binary information). For example:

  String doc=new String( bytes);
  byte[] bytes2=doc.getBytes();

bytes and bytes2 are different.

PS: UTF-8 does not work because it converts some bytes into different values. I tested it and it does not work.

PS2: And no, I don't want BASE64.

Ali
magallanes
  • you have to use a proper encoding – nachokk Jul 10 '13 at 15:37
  • @TheNewIdiot the answer in that post solves nothing. I want a byte-for-byte conversion and the answer says "convert it or bust". How is it possible that Java can't do that? – magallanes Jul 10 '13 at 15:41
  • Java makes a superb distinction between binary data (bytes) and text (String). For text they chose Unicode internally, so all languages are covered. Though you can use an encoding like ISO-8859-1 to convert bytes as they are to a String and vice versa, these Strings may have artifacts like a binary 0. – Joop Eggen Jul 10 '13 at 16:15
  • You almost certainly _do_ want Base64, which is the only way you're going to get reversible byte-to-String encoding. – Louis Wasserman Jul 10 '13 at 16:27

3 Answers

16

You need to specify the encoding you want, e.g. for UTF-8:

String doc = ....
byte[] bytes = doc.getBytes("UTF-8");
String doc2 = new String(bytes, "UTF-8");

doc and doc2 will be the same.

To decode a byte[] you need to know what encoding was used to be sure it will decode correctly.
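
As a minimal sketch of why the decoding charset has to match the encoding one, the following compares a matching round trip with a mismatched one. The class name, method names, and sample text are made up for illustration; the `StandardCharsets` constants are assumed to be available (Java 7 or later) and avoid the checked exception that the string-named overloads throw.

```java
import java.nio.charset.StandardCharsets;

class Utf8TextRoundTrip {
    // Encode text to bytes and decode it again with the same charset.
    static String roundTrip(String text) {
        byte[] encoded = text.getBytes(StandardCharsets.UTF_8);
        return new String(encoded, StandardCharsets.UTF_8);
    }

    // Decode UTF-8 bytes with a different charset to show the mismatch.
    static String decodeWithWrongCharset(String text) {
        byte[] encoded = text.getBytes(StandardCharsets.UTF_8);
        return new String(encoded, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String doc = "caf\u00e9"; // hypothetical sample text with one non-ASCII character
        System.out.println(doc.equals(roundTrip(doc)));              // true
        System.out.println(doc.equals(decodeWithWrongCharset(doc))); // false
    }
}
```

Note that this only covers the text-to-bytes-to-text direction. It says nothing about starting from arbitrary binary bytes, which is what the question asks about.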

Peter Lawrey
14

Here's one way to convert an array of bytes into a String and back:

String doc=new String(bytes, "ISO-8859-1");
byte[] bytes2=doc.getBytes("ISO-8859-1");

A String is a sequence of characters, so you'll have to somehow encode bytes as characters. The ISO-8859-1 encoding maps each byte to a single, unique character, so it's safe to use for the conversion. Note that other encodings, such as UTF-8, are not safe in this sense because some byte sequences don't map to valid strings in those encodings.
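
One way to check this claim is to push every possible byte value through both encodings and compare the result with the input. The sketch below does that; the class and method names are hypothetical, and it assumes the `StandardCharsets` constants (Java 7 or later) rather than the string charset names used above.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Latin1RoundTrip {
    // Every possible byte value, 0x00 through 0xFF.
    static byte[] allByteValues() {
        byte[] bytes = new byte[256];
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] = (byte) i;
        }
        return bytes;
    }

    // True if decoding and then re-encoding with the charset gives back the same bytes.
    static boolean survivesRoundTrip(byte[] bytes, Charset charset) {
        String asText = new String(bytes, charset);
        return Arrays.equals(bytes, asText.getBytes(charset));
    }

    public static void main(String[] args) {
        byte[] bytes = allByteValues();
        System.out.println(survivesRoundTrip(bytes, StandardCharsets.ISO_8859_1)); // true
        System.out.println(survivesRoundTrip(bytes, StandardCharsets.UTF_8));      // false
    }
}
```

With UTF-8, the bytes from 0x80 upward do not form valid sequences here, so the decoder substitutes a replacement character and the original values are lost.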

Joni
1

The "proper conversion" between byte[] and String is to explicitly state the encoding you want to use. If you start with a byte[] and it does not in fact contain text data, there is no "proper conversion". Strings are for text, byte[] is for binary data, and the only really sensible thing to do is to avoid converting between them unless you absolutely have to.

If you really must use a String to hold binary data then the safest way is to use Base64 encoding.
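
For reference, a Base64 round trip might look like the sketch below. It assumes Java 8 or later, where `java.util.Base64` is part of the standard library (older versions would need a third-party codec); the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Base64;

class Base64RoundTrip {
    // Encode arbitrary binary data as an ASCII-only String.
    static String toText(byte[] bytes) {
        return Base64.getEncoder().encodeToString(bytes);
    }

    // Recover the original bytes from the Base64 text.
    static byte[] toBytes(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        byte[] bytes = {0x00, (byte) 0xFF, (byte) 0x80, 0x41}; // sample binary data
        String doc = toText(bytes);
        byte[] bytes2 = toBytes(doc);
        System.out.println(Arrays.equals(bytes, bytes2)); // true
    }
}
```

The resulting String is about a third longer than the input, but it contains only printable ASCII characters, so it survives storage and transport paths that would mangle raw binary.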

Source by Michael Borgwardt

Community
Stephan
  • What if the string is only a representation, and we use proper conversion methods when converting back to the byte array? – Eftekhari Mar 02 '16 at 23:59