This is similar to this question, but I specifically need to know how to convert to ISO-8859-1 format, not UTF-8.
Short question: I need a character followed by a combining diaeresis converted to its Latin-1 equivalent (if one exists).
Longer question: I have German strings that contain combining diaereses (Unicode code point U+0308, UTF-8 bytes [cc][88]), but my database only supports ISO-8859-1 (i.e. Latin-1). Because the characters are "decomposed" (a base letter followed by the combining diaeresis), I can't just convert to ISO-8859-1: U+0308 acts on the preceding character and has no ISO-8859-1 encoding of its own, and the resulting combination may or may not have a corresponding character in ISO-8859-1.
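To make the decomposed form concrete, here is a small sketch that compares the two spellings of "für". The class and method names (`DecomposedDemo`, `utf8Hex`) are made up for illustration; only standard JDK calls are used.

```java
import java.nio.charset.StandardCharsets;

public class DecomposedDemo {
    // Returns the UTF-8 encoding of s as space-separated lowercase hex bytes.
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String decomposed = "fu\u0308r";  // 'u' followed by U+0308 (combining diaeresis)
        String precomposed = "f\u00fcr";  // single code point U+00FC

        System.out.println(decomposed.length());            // 4
        System.out.println(utf8Hex(decomposed));            // 66 75 cc 88 72
        System.out.println(precomposed.length());           // 3
        System.out.println(utf8Hex(precomposed));           // 66 c3 bc 72
        System.out.println(decomposed.equals(precomposed)); // false
    }
}
```

Both strings render identically, but only the second contains a character that ISO-8859-1 can represent directly.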
I tried this code:
import java.nio.charset.Charset;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
// the u is followed by U+0308 (combining diaeresis)
String s = "fu\u0308r";
Charset utf8charset = Charset.forName("UTF-8");
Charset iso88591charset = Charset.forName("ISO-8859-1");
// get the UTF-8 bytes explicitly instead of relying on the platform default charset
ByteBuffer inputBuffer = ByteBuffer.wrap(s.getBytes(utf8charset));
// decode UTF-8
CharBuffer data = utf8charset.decode(inputBuffer);
// encode ISO-8859-1
ByteBuffer outputBuffer = iso88591charset.encode(data);
byte[] outputData = outputBuffer.array();
// interpret the encoded bytes as ISO-8859-1, not the platform default
String isoString = new String(outputData, iso88591charset);
//isoString is "fu?r"
But it just fails to encode the combining diaeresis rather than seeing that u followed by U+0308 could be converted to U+00FC (UTF-8 [c3][bc], ISO-8859-1 [fc]). Is there a library that can detect when a character followed by a combining diaeresis can map to an existing ISO-8859-1 character? (Preferably in Java)
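One candidate might be the JDK's own `java.text.Normalizer`. The following is a minimal sketch, assuming that composing to NFC before encoding is acceptable for this data; the names `ComposeToLatin1` and `toLatin1` are hypothetical, and anything that still has no ISO-8859-1 equivalent after composition is replaced with `?` by `String.getBytes(Charset)`.

```java
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

public class ComposeToLatin1 {
    // Composes base letter + combining mark pairs (NFC) where a precomposed
    // code point exists, then encodes the result as ISO-8859-1.
    // Characters that remain unmappable are encoded as '?' (0x3f).
    static byte[] toLatin1(String s) {
        String composed = Normalizer.normalize(s, Normalizer.Form.NFC);
        return composed.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] out = toLatin1("fu\u0308r"); // 'u' + U+0308 composes to U+00FC
        String isoString = new String(out, StandardCharsets.ISO_8859_1);
        System.out.println(isoString.length());               // 3
        System.out.println(isoString.equals("f\u00fcr"));     // true
        System.out.println(Integer.toHexString(out[1] & 0xff)); // fc
    }
}
```

A pair such as q followed by U+0308 has no precomposed form, so it would still come out as `q?`; whether that is acceptable, or whether the mark should be stripped instead, depends on the data.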