Since our system has now switched to UTF-8, I have to replace all existing Unicode escapes with the corresponding UTF-8 characters. Unfortunately, my code does not work and I don't know why.

I pass the method a string, e.g.:
ui.activityFeed.currentActivities=Aktuelle Aktivit\u00e4ten

and expect it to return:
ui.activityFeed.currentActivities=Aktuelle Aktivitäten

  private static String replaceUmlaute(String line) {
    System.out.println(line);
    final ByteBuffer buffer = StandardCharsets.UTF_8.encode(line);
    final String utf8EncodedString = StandardCharsets.UTF_8.decode(buffer).toString();
    System.out.println(utf8EncodedString);
    return utf8EncodedString;
  }

Result:
ui.activityFeed.currentActivities=Aktuelle Aktivit\u00e4ten
ui.activityFeed.currentActivities=Aktuelle Aktivit\u00e4ten

I already tried the replace method, but that didn't work either. Thanks for your help.

Pascal
  • Maybe this helps: https://stackoverflow.com/a/24046962/6809437. It looks like you have property files. – FaltFe Sep 07 '20 at 06:08
  • Thank you very much. That solved my problem. – Pascal Sep 07 '20 at 06:16
  • "I have to replace all existing Unicode_escapes with the corresponding UTF-8 chars": no, this is not necessary. Conceptually, a string in Java does not have an encoding; it is just a sequence of characters. Encoding only becomes relevant when converting to or from bytes. – Henry Sep 07 '20 at 06:44
  • With property files, there's a hazard if you go down that road and replace Unicode escapes with Unicode characters. Make absolutely sure that the property files are always read with the `load(Reader reader)` method (and a Reader based on UTF-8), and never with `load(InputStream inStream)`. The latter, older one assumes ISO-8859-1 encoding for the files and will produce nonsense from UTF-8 files. So I'd recommend keeping the Unicode escapes and staying within the ASCII character set. That will work with both `load()` variants. – Ralf Kleberhoff Sep 07 '20 at 08:28
  • Actually, the job was assigned to me, so I have no choice but to replace the escapes ^^. Anyway, it works now - halfway... The escapes in the Turkish files do not get converted correctly. E.g. \u015E does not become Ş but ? instead. Do you have an idea why? – Pascal Sep 08 '20 at 05:50
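
The points raised in the comments above can be checked with a small sketch. The class and method names (EncodingPitfalls, roundTrip, loadViaStream, loadViaReader) are made up for illustration and do not come from the thread. It shows three things: encoding a string and decoding it with the same charset changes nothing, `load(InputStream)` misreads UTF-8 bytes while `load(Reader)` with a UTF-8 reader does not, and a character that the target charset cannot represent is replaced by ? when encoded. The last point is only one possible explanation for the Turkish files; the thread does not show how the files are written, so treat it as an assumption.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

final class EncodingPitfalls {

    // Encode to bytes and decode again with the same charset.
    static String roundTrip(String s, Charset cs) {
        return new String(s.getBytes(cs), cs);
    }

    // Older variant: load(InputStream) always decodes the bytes as ISO-8859-1.
    static String loadViaStream(byte[] fileBytes) {
        try {
            Properties p = new Properties();
            p.load(new ByteArrayInputStream(fileBytes));
            return p.getProperty("key");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Newer variant: load(Reader) uses whatever charset the Reader was built with.
    static String loadViaReader(byte[] fileBytes) {
        try {
            Properties p = new Properties();
            p.load(new InputStreamReader(new ByteArrayInputStream(fileBytes), StandardCharsets.UTF_8));
            return p.getProperty("key");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The escape is six ordinary characters here: backslash, 'u', four hex digits.
        String escaped = "Aktivit\\u00e4ten";

        // 1. A UTF-8 round trip leaves the escape sequence untouched.
        System.out.println(roundTrip(escaped, StandardCharsets.UTF_8).equals(escaped)); // true

        // 2. A UTF-8 encoded file is only read correctly through a UTF-8 Reader.
        byte[] utf8File = "key=\u015E".getBytes(StandardCharsets.UTF_8);
        System.out.println(loadViaReader(utf8File).equals("\u015E")); // true
        System.out.println(loadViaStream(utf8File).equals("\u015E")); // false

        // 3. ISO-8859-1 can represent the German umlaut but not the Turkish S with cedilla.
        System.out.println(roundTrip("\u00e4", StandardCharsets.ISO_8859_1).equals("\u00e4")); // true
        System.out.println(roundTrip("\u015E", StandardCharsets.ISO_8859_1).equals("?"));      // true
    }
}
```

If the third check matches what happens in your case, the German files would survive because ä exists in ISO-8859-1, while Ş does not. That would point at the writer (or the console used for printing) not using UTF-8.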

1 Answer

That was my solution:

  private static String replaceUmlaute(String line) throws IOException {
    final Properties p = new Properties();
    p.load(new StringReader("key=" + line));
    return p.getProperty("key");
  }
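
For converting whole files, a minimal sketch along the same lines could look like the following. EscapeConverter, unescapeLine and convertFile are hypothetical names, and the choice to read the input as ISO-8859-1 assumes the old files contain only ASCII plus escapes. The important part is that the output charset is stated explicitly as UTF-8 instead of relying on the platform default. Note that `Properties.load` also interprets other backslash sequences and trims leading whitespace, so this is not a pure escape replacement.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

final class EscapeConverter {

    // Same idea as replaceUmlaute: let Properties decode the escapes of one line.
    static String unescapeLine(String line) {
        try {
            Properties p = new Properties();
            // Everything after the first '=' becomes the value of "key".
            p.load(new StringReader("key=" + line));
            return p.getProperty("key");
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected for a StringReader
        }
    }

    // Hypothetical helper: reads the old file line by line and writes a UTF-8 copy.
    static void convertFile(Path in, Path out) {
        try {
            List<String> converted = new ArrayList<>();
            // Assumption: the old file is ASCII with escapes, so ISO-8859-1 reads it safely.
            for (String line : Files.readAllLines(in, StandardCharsets.ISO_8859_1)) {
                converted.add(unescapeLine(line));
            }
            // Explicit UTF-8 so characters outside ISO-8859-1 are not replaced on write.
            Files.write(out, converted, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String line = "ui.activityFeed.currentActivities=Aktuelle Aktivit\\u00e4ten";
        String expected = "ui.activityFeed.currentActivities=Aktuelle Aktivit\u00e4ten";
        System.out.println(unescapeLine(line).equals(expected)); // true
    }
}
```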
Pascal