24

How can I decode a UTF-8 string on Android? I tried the following commands, but the output is the same as the input:

URLDecoder.decode("hello&//à", "UTF-8");

new String("hello&//à", "UTF-8");

EntityUtils.toString("hello&//à", "utf-8");
magemello
  • That String is not in any particular encoding at all. What is the problem you're trying to solve? What exactly do you mean by "decode"? What encoding did you think it was in? – BalusC May 09 '11 at 22:34
  • Try using a local variable to hold the result, e.g.: String str = URLDecoder.decode("hello&//à", "UTF-8"); – Akshatha S R Jul 05 '18 at 06:26

3 Answers

50

A string needs no encoding. It is simply a sequence of Unicode characters.

You need to encode when you want to turn a String into a sequence of bytes. The charset that you choose (UTF-8, cp1255, etc.) determines the character-to-byte mapping. Note that a character is not necessarily translated into a single byte: in UTF-8, for example, ASCII characters take one byte, while every other Unicode character takes two to four bytes.

Encoding of a String is carried out by:

String s1 = "some text";
byte[] bytes = s1.getBytes("UTF-8"); // Charset to encode into

You need to decode when you have a sequence of bytes and you want to turn them into a String. When you do that you need to specify, again, the charset with which the bytes were originally encoded (otherwise you'll end up with garbled text).

Decoding:

String s2 = new String(bytes, "UTF-8"); // Charset with which bytes were encoded 
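The encode and decode steps above can be sketched as a round trip. This is a minimal illustration, not part of the original answer: the class and method names (RoundTripDemo, encode, decode, decodeWrongly) are made up for the example, and it assumes java.nio.charset.StandardCharsets is available (Java 7+, Android API 19+). It uses the string from the question and also decodes with a deliberately wrong charset to show the garbling.

```java
import java.nio.charset.StandardCharsets;

public class RoundTripDemo {
    // Encode a String into UTF-8 bytes.
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Decode bytes into a String using the charset they were encoded with.
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Decode with a mismatched charset on purpose, to show garbled text.
    static String decodeWrongly(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "hello&//\u00e0"; // \u00e0 is 'à'
        byte[] bytes = encode(original);

        // 9 characters become 10 bytes, because 'à' takes two bytes in UTF-8.
        System.out.println(original.length() + " chars, " + bytes.length + " bytes");

        // Same charset on both sides: the text survives the round trip.
        System.out.println(decode(bytes).equals(original));

        // Mismatched charset: the two bytes of 'à' turn into two wrong characters.
        System.out.println(decodeWrongly(bytes).equals(original));
    }
}
```

Passing a Charset constant instead of the name "UTF-8" avoids the checked UnsupportedEncodingException; on older platforms the string-name overloads shown above behave the same way.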

If you want to understand this better, a great text is "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)"

Itay Maman
10

The core functions are getBytes(String charset) and new String(byte[] data). You can use these functions to do UTF-8 decoding.

UTF-8 decoding is actually a string-to-string conversion in which the intermediate buffer is a byte array. Since the target is a UTF-8 string, the only parameter for new String() is the byte array; on Android, where the default charset is UTF-8, that call is equivalent to new String(bytes, "UTF-8").

The key, then, is the charset you pass to getBytes() to recover the underlying byte array from the encoded input string, which you should know beforehand. If you don't, guess the most likely one; "ISO-8859-1" is a good guess for English-language users.

The decoding statement should be:

String decoded = new String(encoded.getBytes("ISO-8859-1"));
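As a sketch of the statement above, the example below repairs a string whose UTF-8 bytes were wrongly read as ISO-8859-1. The class and method names (MojibakeRepair, repair) are hypothetical, and the sketch names UTF-8 explicitly rather than relying on the platform default charset. It assumes the input really was mis-decoded as ISO-8859-1; applying it to a correctly decoded string would damage it instead.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeRepair {
    // Recover the raw bytes by encoding with the charset that was wrongly
    // used for decoding, then decode those bytes as UTF-8.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "hello&//à" whose UTF-8 bytes (C3 A0 for 'à') were read as ISO-8859-1.
        String garbled = "hello&//\u00c3\u00a0";
        System.out.println(repair(garbled).equals("hello&//\u00e0"));
    }
}
```

ISO-8859-1 works for this because it maps every byte value 0 to 255 to exactly one character, so no bytes are lost when converting back.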
Zephyr
0

Try looking at the question "decode string encoded in utf-8 format in android", but it doesn't look like your string is encoded with anything in particular. What do you think the output should be?
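To illustrate why the call in the question returns its input unchanged, here is a minimal sketch using java.net.URLDecoder and java.net.URLEncoder. The class and helper names (UrlDecodeDemo, urlDecode, urlEncode) are made up for the example. URLDecoder only rewrites percent escapes and '+' signs, so a string containing neither comes back as it went in.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class UrlDecodeDemo {
    // Decode percent-encoded text, interpreting the escaped bytes as UTF-8.
    static String urlDecode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always supported, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Percent-encode text, writing non-ASCII characters as UTF-8 bytes.
    static String urlEncode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String plain = "hello&//\u00e0"; // \u00e0 is 'à'

        // No percent escapes in the input, so decoding changes nothing.
        System.out.println(urlDecode(plain).equals(plain));

        // Encoding first produces escapes that decoding can then undo.
        String encoded = urlEncode(plain);
        System.out.println(encoded);
        System.out.println(urlDecode(encoded).equals(plain));
    }
}
```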

Pete Hamilton