-6

I have a string that might contain Latin symbols and/or letters.

How do I take that String and convert it to a UTF-8 encoded String?

For example, if my String is:

"óóó"

I wish to convert it to be:

"óóó"

Tal Angel
  • Hope this answers your question! https://stackoverflow.com/questions/5729806/encode-string-to-utf-8 – user12758604 Mar 11 '20 at 15:09
  • 1
    Java strings are Java strings - they are not "encoded". It seems to me that your question is "How do I take data that might be encoded using latin-1 text format and interpret it as utf-8 instead? – ControlAltDel Mar 11 '20 at 15:11
  • 1
    Don’t use a String to hold arbitrary bytes. A string in C can do that, but not in Java. In Java, bytes belong in a `byte[]` array. A String holds characters, not bytes, and converting between the two will always risk losing or corrupting data. – VGR Mar 11 '20 at 16:31
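
To illustrate the point made in the comments above, that a Java String holds characters and an encoding only appears once you turn it into bytes, here is a minimal sketch. The class and helper names (`StringVersusBytes`, `latin1Bytes`, `utf8Bytes`) are made up for this example; only `String.getBytes(Charset)` and `StandardCharsets` come from the JDK.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class: the same String yields different bytes per charset.
final class StringVersusBytes {
    // Encodes the characters of `text` as ISO-8859-1 (one byte per character).
    static byte[] latin1Bytes(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Encodes the characters of `text` as UTF-8 (one to four bytes per code point).
    static byte[] utf8Bytes(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = "\u00F3"; // the single character ó
        System.out.println(Arrays.toString(latin1Bytes(text))); // [-13]      (0xF3)
        System.out.println(Arrays.toString(utf8Bytes(text)));   // [-61, -77] (0xC3 0xB3)
    }
}
```

The String itself is identical in both calls; only the `byte[]` differs, which is why "converting a String to UTF-8" really means choosing a charset at the moment the bytes are produced or consumed.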

1 Answer

0
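import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;

public class Main {
    public static void main(String[] args) {
        String input = "ÁÉÍÓÚÜÑáéíóúüñ¡¿";
        // Simulate ISO-8859-1 input, then decode it and re-encode it as UTF-8
        ByteBuffer iso = StandardCharsets.ISO_8859_1.encode(input);
        CharBuffer buffer = StandardCharsets.ISO_8859_1.decode(iso);
        ByteBuffer byteBuffer = StandardCharsets.UTF_8.encode(buffer);
        // Decode with an explicit charset; array() may hold unused trailing bytes
        System.out.println(StandardCharsets.UTF_8.decode(byteBuffer));
    }
}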
Ryan
  • I suggest using [`StandardCharsets.ISO_8859_1`](https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/nio/charset/StandardCharsets.html#ISO_8859_1) and [`StandardCharsets.UTF_8`](https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/nio/charset/StandardCharsets.html#UTF_8) – user85421 Mar 11 '20 at 16:02
  • @Ryan This is my string: Á\nÉ\nÍ\nÓ\nÚ\nÜ\nÑ\ná\né\ní\nó\nú\nü\nñ\n¡\n¿\n Your code turned it into: Ã\nÃ\nÃ\nÃ\nÃ\nÃ\nÃ\ná\né\ní\nó\nú\nü\nñ\n¡\n¿\n what is ? – Tal Angel Mar 11 '20 at 16:07
  • I was assuming you'd replace the quoted string with an actual ISO_8859_1 string. Java treats string as UTF-8 by default. I'll update my code to make it more obvious. – Ryan Mar 11 '20 at 16:21
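
If the underlying problem is the one the comments hint at, namely UTF-8 bytes that were decoded as ISO-8859-1 somewhere upstream, a round trip through the bytes can often undo the damage. The sketch below assumes exactly that scenario; `MojibakeRepair` and `repair` are hypothetical names, and the approach may not work if the wrong charset was actually windows-1252 or if data was already lost.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for undoing a UTF-8-read-as-Latin-1 mix-up.
final class MojibakeRepair {
    // Recovers the original bytes via ISO-8859-1, then decodes them as UTF-8.
    static String repair(String garbled) {
        byte[] originalBytes = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String expected = "\u00F3\u00F3\u00F3"; // "óóó"
        // Simulate the mistake: UTF-8 bytes decoded with the wrong charset.
        byte[] utf8 = expected.getBytes(StandardCharsets.UTF_8);
        String garbled = new String(utf8, StandardCharsets.ISO_8859_1);

        String repaired = repair(garbled);
        System.out.println(repaired.equals(expected)); // true
    }
}
```

This works because Java's ISO-8859-1 decoder maps every byte value to exactly one character, so no information is lost in the wrong decode. The more robust fix is usually to specify the correct charset at the point where the bytes are first read.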