
To convert a string to a byte array, I do the following: `byte[] nameByteArray = cityName.getBytes();`

To convert it back, I tried `String retrievedString = new String(nameByteArray);`, which obviously doesn't work. How would I convert it back?

darksky
  • You need to specify the charset name on `new String()`, for example `new String(byte[], "utf-8");`. Use the same charset that was used to encode the original string. – Augusto Sep 13 '11 at 12:37
  • That's how you are supposed to convert it back, e.g. http://ideone.com/TDb7E. Can you explain exactly what doesn't work? – Bala R Sep 13 '11 at 12:37
  • Read [the canonical essay](http://www.joelonsoftware.com/articles/Unicode.html) to understand why you need to specify the encoding when converting bytes to a string. – dm3 Sep 13 '11 at 12:40
  • possible duplicate of [What is character encoding and why should I bother with it](http://stackoverflow.com/questions/10611455/what-is-character-encoding-and-why-should-i-bother-with-it) – Raedwald Apr 10 '15 at 12:24

2 Answers


Which characters does your original city name contain? Try the explicit UTF-8 version, like this:

byte[] nameByteArray = cityName.getBytes("UTF-8");
String retrievedString = new String(nameByteArray, "UTF-8");
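
As a minimal sketch, assuming Java 7 or later (where `java.nio.charset.StandardCharsets` is available), the same round trip can be written with a charset constant instead of a charset name. The `String`-name overloads above declare the checked `UnsupportedEncodingException`; the `Charset` overloads do not. The class name `Utf8RoundTrip` and the sample city name are illustrative only and not taken from the question.

```java
import java.nio.charset.StandardCharsets;

final class Utf8RoundTrip {
    // Encode the text to UTF-8 bytes, then decode those bytes with the same charset.
    static String roundTrip(String text) {
        byte[] nameByteArray = text.getBytes(StandardCharsets.UTF_8);
        return new String(nameByteArray, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String cityName = "Albuquerque"; // sample value, assumed for illustration
        String retrievedString = roundTrip(cityName);
        System.out.println(retrievedString.equals(cityName)); // prints true
    }
}
```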
anubhava
  • That shouldn't be the issue, since both getBytes() and String(byte[] byteArray) use the default charset, which is obviously the same in both cases, assuming he is doing this on a single machine. – Jan Zyka Sep 13 '11 at 12:40
  • +1, explains it all. "What characters" = "What's the actual character encoding". – Andreas Dolk Sep 13 '11 at 12:40
  • @Jan: It would only work if the default character encoding is able to encode all of the characters in the existing text. – Jon Skeet Sep 13 '11 at 12:43
  • Seems like it works, but when I do the following: `System.out.print("PTRSP - "); System.out.println(retrievedString);`, the first character printed is always a 6. So the above code prints: `6TRSP - ??Albuquerque`. Also, why do I get `??` at the beginning? – darksky Sep 13 '11 at 15:16
  • Does that mean cityName contains some non-printable characters? Can you tell me the character encoding of the original string variable cityName? Also, dump the char array of the original String variable cityName and paste the character codes here. – anubhava Sep 13 '11 at 15:25
  • Can you actually dump the char array of a string in Java? How would you do that? – darksky Sep 13 '11 at 20:59
  • This is how you get a hex dump of all the characters in your String. `char[] chArr = cityName.toCharArray(); for (char ch : chArr) System.out.println(Integer.toHexString(ch)); ` – anubhava Sep 14 '11 at 05:07
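
The hex dump suggested in the last comment can be sketched as a small self-contained program. This is only an illustration: the class name `CharDump`, the helper `hexDump`, and the sample string with two leading control characters are assumptions, since the actual contents of `cityName` are not shown in the thread.

```java
final class CharDump {
    // Return the UTF-16 code units of the text as space-separated hex values.
    static String hexDump(String text) {
        StringBuilder out = new StringBuilder();
        for (char ch : text.toCharArray()) {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(Integer.toHexString(ch));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical value: two non-printable characters followed by text.
        String cityName = "\u0001\u0002Alb";
        System.out.println(hexDump(cityName)); // prints 1 2 41 6c 62
    }
}
```

Any values below 20 (hex) at the start of the output would point to control characters in the string rather than to a problem in the byte conversion itself.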

which obviously doesn't work.

Actually that's exactly how you do it. The only thing that can go wrong is that you're implicitly using the platform default encoding, which could differ between systems, and might not be able to represent all characters in the string.

The solution is to explicitly use an encoding that can represent all characters, such as UTF-8:

byte[] nameByteArray = cityName.getBytes("UTF-8");

String retrievedString = new String(nameByteArray, "UTF-8");
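
To illustrate the failure mode, here is a minimal sketch that round-trips a string through a charset that cannot represent one of its characters. US-ASCII stands in for a limited platform default, which is an assumption for the demo; the class name `CharsetLoss` and the sample value are hypothetical, and `StandardCharsets` requires Java 7 or later.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetLoss {
    // Encode and decode the text with the same charset.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        String cityName = "Z\u00fcrich"; // contains U+00FC, which US-ASCII cannot encode

        // getBytes(Charset) substitutes the charset's replacement byte for unmappable characters.
        System.out.println(roundTrip(cityName, StandardCharsets.US_ASCII)); // prints Z?rich

        // UTF-8 can encode every character, so the round trip is lossless.
        System.out.println(roundTrip(cityName, StandardCharsets.UTF_8).equals(cityName)); // prints true

        // The charset used by the no-argument getBytes() and new String(byte[]).
        System.out.println(Charset.defaultCharset());
    }
}
```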
Michael Borgwardt