I have a string as follows:
this is the string u00c5 with missing slash before unicode characters
It contains Unicode character codes, but the backslash before each "u" is missing. How can I print this string correctly?
What I have tried:
I tried to add a backslash before each incomplete Unicode escape using the following code. However, "\u$1" is not allowed as the replacement string in replaceAll.
    public String sanitizeUnicodeQuirk(String input) {
        try {
            // First attempt: "$1" is passed to parseInt literally, which makes valueOf and parseInt useless
            // String processedInput = input.replaceAll("[uU]([0123456789abcdefABCDEF]{4})", String.valueOf(Integer.parseInt("$1", 16)));
            // Second attempt: a single backslash followed by "u$1" does not compile, so the backslash is escaped instead
            String processedInput = input.replaceAll("[uU]([0123456789abcdefABCDEF]{4})", "\\\\u$1");
            String newInput = new String(processedInput.getBytes(), "UTF-8");
            return newInput;
        } catch (UnsupportedEncodingException e) {
            e.printStackTrace();
        }
        return input;
    }
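
For reference, here is a minimal sketch of one possible way to do this; it is not the only option. Escapes such as `\u00c5` are decoded by the Java compiler in source code, not at runtime, so inserting a backslash into the string only yields the six literal characters `\u00c5`. The sketch therefore parses the four hex digits itself and substitutes the resulting character. The class name `UnicodeEscapeFixer` and the method name `restore` are made up for illustration. It also assumes that every "u" followed by four hex digits really is a damaged escape, so ordinary text that happens to match (for example "ubeef") would be converted too.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
class UnicodeEscapeFixer {

    // 'u' or 'U' followed by exactly four hex digits, e.g. u00c5
    private static final Pattern BARE_ESCAPE = Pattern.compile("[uU]([0-9a-fA-F]{4})");

    static String restore(String input) {
        Matcher matcher = BARE_ESCAPE.matcher(input);
        // StringBuffer keeps this compatible with Java 8
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            // Parse the captured hex digits into a UTF-16 code unit
            char decoded = (char) Integer.parseInt(matcher.group(1), 16);
            // quoteReplacement guards against decoded '$' or backslash characters
            matcher.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(decoded)));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(restore("this is the string u00c5 with missing slash before unicode characters"));
    }
}
```

For the sample string, `restore` returns "this is the string Å with missing slash before unicode characters"; whether the console displays Å correctly depends on its encoding. The `new String(processedInput.getBytes(), "UTF-8")` round trip from the attempt above is not needed, since a Java `String` already holds the decoded characters.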