I am trying to decode UTF-8 one byte at a time with a CharsetDecoder. Is this possible?
The following code
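    import java.nio.ByteBuffer;
    import java.nio.CharBuffer;
    import java.nio.charset.Charset;
    import java.nio.charset.CharsetDecoder;
    import java.nio.charset.CoderResult;

    public class DecodeTest {
        public static void main(String[] args) {
            Charset cs = Charset.forName("utf8");
            CharsetDecoder decoder = cs.newDecoder();
            CoderResult res;
            byte[] source = new byte[] {(byte) 0xc3, (byte) 0xa6}; // LATIN SMALL LETTER AE in UTF-8
            // One-byte input buffer and one-char output buffer, reused for both calls
            byte[] b = new byte[1];
            ByteBuffer bb = ByteBuffer.wrap(b);
            char[] c = new char[1];
            CharBuffer cb = CharBuffer.wrap(c);
            decoder.reset();

            // First call: feed only the lead byte 0xc3
            b[0] = source[0];
            bb.rewind();
            cb.rewind();
            res = decoder.decode(bb, cb, false);
            System.out.println(res);
            System.out.println(cb.remaining());

            // Second call: overwrite the buffer with the continuation byte 0xa6
            b[0] = source[1];
            bb.rewind();
            cb.rewind();
            res = decoder.decode(bb, cb, false);
            System.out.println(res);
            System.out.println(cb.remaining());
        }
    }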
gives the following output.
UNDERFLOW
1
MALFORMED[1]
1
Why does the second call report MALFORMED[1] instead of decoding the character?
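
A likely explanation, offered as a sketch rather than a definitive answer: when `decode` returns UNDERFLOW on an incomplete sequence, it leaves those bytes unconsumed in the input buffer, and the caller is expected to keep them available for the next invocation. In the code above, `bb.rewind()` followed by overwriting `b[0]` throws away 0xc3, so the second call sees only the continuation byte 0xa6, which is malformed on its own. The example below preserves unconsumed bytes with `compact()`. The class name `ByteByByteDecode`, the helper `decodeByteByByte`, and the buffer sizes are assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

public class ByteByByteDecode {

    // Hypothetical helper: feeds 'source' to a UTF-8 decoder one byte at a time.
    static String decodeByteByByte(byte[] source) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        // Assumed sizes: 4 bytes holds the longest UTF-8 sequence,
        // 2 chars holds a surrogate pair.
        ByteBuffer bb = ByteBuffer.allocate(4);
        CharBuffer cb = CharBuffer.allocate(2);
        StringBuilder out = new StringBuilder();

        for (byte value : source) {
            bb.put(value);   // append the new byte after any leftover bytes
            bb.flip();       // switch the buffer to reading mode
            CoderResult res = decoder.decode(bb, cb, false);
            if (res.isError()) {
                throw new IllegalArgumentException("Bad input: " + res);
            }
            bb.compact();    // keep unconsumed bytes for the next call
            cb.flip();
            out.append(cb);  // collect whatever characters were produced
            cb.clear();
        }

        // Signal end of input so a dangling partial sequence is reported.
        bb.flip();
        CoderResult res = decoder.decode(bb, cb, true);
        if (res.isError()) {
            throw new IllegalArgumentException("Bad input: " + res);
        }
        decoder.flush(cb);
        cb.flip();
        out.append(cb);
        return out.toString();
    }

    public static void main(String[] args) {
        byte[] source = new byte[] {(byte) 0xc3, (byte) 0xa6};
        String decoded = decodeByteByByte(source);
        System.out.println(decoded.equals("\u00e6")); // prints true
    }
}
```

With this arrangement the first call still returns UNDERFLOW, but 0xc3 stays in the buffer, so the second call sees the full two-byte sequence and produces the character.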