I am working with Java 8 and I18N. As I understand it, .properties files (and the I18N code that reads them) assume the files are encoded in ISO-8859-1. As a result, I'm having trouble with characters that cannot be represented in that encoding.
Switching from a FileWriter to an OutputStreamWriter won't help, since the code on the reading end won't be able to read these characters anyway.
I did come up with a solution that works, but it is highly inelegant.
StringBuilder utfRepresentation = new StringBuilder();
for (int index = 0; index < input.length(); index++) {
    char current = input.charAt(index);
    if (!Charset.forName("ISO-8859-1").newEncoder().canEncode(current)) {
        // Not representable in ISO-8859-1: write a Unicode escape, zero-padded to four hex digits.
        utfRepresentation.append("\\u");
        utfRepresentation.append(String.format("%04x", (int) current));
    } else {
        utfRepresentation.append(current);
    }
}
I still need to do other things, such as creating the encoder once instead of on every iteration, but my questions are about something else entirely:
1) Is there a cleaner way of transforming ‰ into \u2030?
2) What even is this U+2030? UTF-8/16?
3) Is there a better way of creating that charset / encoder? Something that isn't static? Can I extract it from the file, or from a file reader / writer?
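To make questions 1 and 2 concrete, here is a minimal sketch, not a definitive answer. The class and method names (`PropertiesEscapeSketch`, `escapeNonLatin1`, `storeToString`, `hex`) are made up for illustration. It shows the manual escaping with a single reused encoder, the same result obtained by letting `java.util.Properties.store(OutputStream, String)` do the escaping, and the bytes of U+2030 in UTF-8 and UTF-16. The last part is there to illustrate that U+2030 names a Unicode code point, and that UTF-8 and UTF-16 are two different byte encodings of it.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class PropertiesEscapeSketch {

    // Hypothetical helper: escapes every char that ISO-8859-1 cannot represent.
    // The encoder is created once per call instead of once per character.
    public static String escapeNonLatin1(String input) {
        CharsetEncoder latin1 = StandardCharsets.ISO_8859_1.newEncoder();
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (latin1.canEncode(c)) {
                out.append(c);
            } else {
                // Escapes need exactly four hex digits, hence the zero padding.
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    // Hypothetical helper: lets Properties.store(OutputStream, ...) do the work.
    // That overload writes ISO-8859-1 and escapes chars outside printable ASCII.
    public static String storeToString(String key, String value) {
        Properties props = new Properties();
        props.setProperty(key, value);
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try {
            props.store(bytes, null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(bytes.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    // Formats bytes as space-separated lowercase hex pairs.
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String perMille = "\u2030"; // the per mille sign

        // Manual escaping of a value containing the sign.
        System.out.println(escapeNonLatin1("5" + perMille));

        // Same value written by Properties itself (output also has a date comment line).
        System.out.println(storeToString("rate", "5" + perMille));

        // One code point, two different byte encodings.
        System.out.println(hex(perMille.getBytes(StandardCharsets.UTF_8)));    // e2 80 b0
        System.out.println(hex(perMille.getBytes(StandardCharsets.UTF_16BE))); // 20 30
    }
}
```

If the file is written through `Properties.store` with an `OutputStream`, the manual loop may not be needed at all. Note that the `Writer` overload of `store` does not escape non-ASCII characters, so the choice of overload matters here.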