I want to convert the character encoding from Shift-JIS to UTF-8, but it does not work, as shown below.
I would appreciate it if you could show me how to convert to UTF-8.

String str = "���[�_�u���R�[�h,�������킩��₷���B";
byte[] bytes1 = str.getBytes("SJIS");
String newstr = new String(bytes1,"UTF-8");
System.out.println(newstr);
=> ???[?_?u???R?[?h,??????????????B

I'm using Java 17.

Ivar
katahik
  • I think you have to read the characters as SJIS first and then convert them to UTF-8. As it stands, you cannot be sure that this string is SJIS; it could also be UTF-8. – Melron Nov 24 '22 at 10:18
  • The string you start out with is mangled and can no longer be used for anything useful. If this truly is your input, the error has already occurred and is no longer fixable; go back in the process: Presumably it starts out somewhere with actual text, then something converts that to bytes by encoding with UTF-8, and then something else decodes with SJIS. That's the problem. Either fix the encoding step (also use SJIS) or fix the decoding (use UTF-8). If that's beyond your control, there is nothing you can do here. – rzwitserloot Nov 24 '22 at 10:22
  • The duplicate question is about another encoding, but the basic answer is the same (as @rzwitserloot correctly pointed out): you can't, it's too late, information has been irrecoverably lost, you need to fix the code at an earlier point (apply the correct encoding where the bytes are first converted to a `String`). – Joachim Sauer Nov 24 '22 at 11:57
  • It seems that the process where the file was originally loaded was incorrect. Thank you very much. – katahik Nov 25 '22 at 10:22
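
To illustrate the point made in the comments, here is a minimal sketch of converting at the byte level, where the charset is applied when the bytes are first turned into a `String`. It assumes the raw Shift-JIS bytes are still available and that the JDK ships the `Shift_JIS` charset (typical for standard JDK 17 builds, but worth checking on your runtime). The class and method names are hypothetical, and the sample text is written with Unicode escapes so that the source file's own encoding does not matter.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class SjisToUtf8 {
    // Shift_JIS is not part of StandardCharsets, so it is looked up by name.
    // Assumption: the running JDK provides this charset.
    static final Charset SJIS = Charset.forName("Shift_JIS");

    // Decode raw Shift-JIS bytes into text, then re-encode that text as UTF-8 bytes.
    static byte[] sjisBytesToUtf8Bytes(byte[] sjisBytes) {
        String text = new String(sjisBytes, SJIS);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Sample text (three kanji), used only to produce Shift-JIS bytes for the demo.
        String original = "\u65E5\u672C\u8A9E";
        byte[] sjisBytes = original.getBytes(SJIS);

        // Correct: decode with the charset the bytes were actually written in.
        byte[] utf8Bytes = sjisBytesToUtf8Bytes(sjisBytes);
        String roundTrip = new String(utf8Bytes, StandardCharsets.UTF_8);
        System.out.println("round trip intact: " + roundTrip.equals(original));

        // Wrong: decoding Shift-JIS bytes as UTF-8 replaces invalid sequences
        // with U+FFFD, after which the original bytes cannot be recovered.
        String mangled = new String(sjisBytes, StandardCharsets.UTF_8);
        System.out.println("mangled has U+FFFD: " + mangled.contains("\uFFFD"));
    }
}
```

If the data comes from a file, the same idea applies at read time: pass the Shift-JIS charset to whatever call first decodes the bytes (for example `Files.readString(path, charset)`), rather than trying to repair the `String` afterwards.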
