CharBuffer and ByteBuffer - charset encoding

Question

Java stores characters in UCS-2 format.

    byte[] bytes = {0x00, 0x48, 0x00, 0x69, 0x00, 0x2c,
                    0x60, (byte)0xA8, 0x59, 0x7D, 0x00, 0x21};
    // Print UCS-2 in hex codes
    System.out.printf("%10s", "UCS-2");
    for(int i=0; i<bytes.length; i++) {
        System.out.printf("%02x", bytes[i]);
    }

1) In the code below,

    Charset charset = Charset.forName("UTF-8");
    // Encode from UCS-2 to UTF-8
    // Create a ByteBuffer by wrapping a byte array
    ByteBuffer bb = ByteBuffer.wrap(bytes);

What byte order does bb use to store the bytes after wrap(): big-endian or little-endian?

2) In the code below,

    // Create a CharBuffer from a view of this ByteBuffer
    CharBuffer cb = bb.asCharBuffer();
    ByteBuffer bbOut = charset.encode(cb);

What encoding does asCharBuffer() use to interpret the bytes of bb as characters in cb?
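Editor's note: both questions can be checked empirically. The sketch below is a minimal experiment, not an authoritative answer. The class name `ByteOrderDemo` and the helper `viewAsChars` are made up for illustration. It relies on documented `java.nio` behaviour: a new `ByteBuffer` starts in `BIG_ENDIAN` order, `wrap()` leaves the backing array untouched, and `asCharBuffer()` combines each pair of bytes into one `char` using the buffer's byte order at the moment the view is created.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.CharBuffer;

public class ByteOrderDemo {
    // Same bytes as in the question: "Hi,\u60a8\u597d!" as big-endian 16-bit code units.
    static final byte[] BYTES = {0x00, 0x48, 0x00, 0x69, 0x00, 0x2c,
                                 0x60, (byte) 0xA8, 0x59, 0x7D, 0x00, 0x21};

    // Hypothetical helper: view the bytes as chars under the given byte order.
    static String viewAsChars(ByteOrder order) {
        ByteBuffer bb = ByteBuffer.wrap(BYTES.clone());
        bb.order(order);                    // must be set before the view is created
        CharBuffer cb = bb.asCharBuffer();  // each pair of bytes becomes one char
        return cb.toString();
    }

    public static void main(String[] args) {
        ByteBuffer bb = ByteBuffer.wrap(BYTES);

        // The initial order of a byte buffer is always BIG_ENDIAN.
        System.out.println(bb.order());

        // wrap() does not reorder anything: index 0 is still 0x00, index 11 is still 0x21.
        System.out.printf("%02x %02x%n", bb.get(0), bb.get(11));

        // Big-endian view: 0x00,0x48 -> U+0048 'H', and so on.
        System.out.println(viewAsChars(ByteOrder.BIG_ENDIAN));

        // Little-endian view: the same two bytes become U+4800 instead.
        System.out.printf("%04x%n", (int) viewAsChars(ByteOrder.LITTLE_ENDIAN).charAt(0));
    }
}
```

The byte order only matters for multi-byte reads, writes, and views such as `asCharBuffer()`. Individual bytes obtained with `get(int)` are the array elements unchanged.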

I don't understand the first question. There are no characters or strings, so where does encoding come into the picture? — shmosel, Oct 31 '17 at 05:37
@shmosel Does `bb[0]` hold `0x21`, or does `bb[11]` hold `0x21`? — overexchange, Oct 31 '17 at 05:40
`bb[11]` does, assuming we're talking about the same byte array. — shmosel, Oct 31 '17 at 05:42
@shmosel Yes. I mean the MSB and LSB positions of the bits stored in each byte of the `ByteBuffer`, compared with the `byte` data type. — overexchange, Oct 31 '17 at 05:45
You're wrapping an existing byte array. The buffer's endianness won't have any effect until you start writing to it. — shmosel, Oct 31 '17 at 05:48
*Java stores characters in UCS-2 format.* This is false. Until Java 9, strings were stored as simple `char[]` arrays. Since Java 9, they're encoded in one of two byte encodings, neither of which is UCS-2. — shmosel, Oct 31 '17 at 05:48
@shmosel This [answer](https://stackoverflow.com/a/36236799/3317808) says Java originally used UCS-2, which runs out at code point 65535. Java now uses UTF-16, which covers the extra planes while staying backward compatible with UCS-2. — overexchange, Oct 31 '17 at 07:34
Ok, I understand what you're saying now. But I still don't see your point. — shmosel, Oct 31 '17 at 08:08
Your second question is answered [here](https://stackoverflow.com/questions/6750123/what-charset-does-bytebuffer-ascharbuffer-use). — shmosel, Oct 31 '17 at 08:09
@shmosel So for the second question, can I say that `CharBuffer cb = bb.asCharBuffer()` is equivalent to `Charset cset = Charset.forName("UTF-16"); CharBuffer cb = cset.decode(bb)`? — overexchange, Oct 31 '17 at 16:43
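
Editor's note: the equivalence raised in the last comment can be tested directly. The following is a rough sketch under stated assumptions, with a made-up class name `CharViewVsDecode` and made-up helper names. For this particular input, the `asCharBuffer()` view and a `UTF-16BE` decode should produce the same characters. Java's `UTF-16` charset also matches here because its decoder falls back to big-endian when the input has no byte-order mark. The two approaches are not identical in general: the view does no validation and always follows the buffer's byte order, whereas a decoder replaces malformed input such as unpaired surrogates, and the `UTF-16` decoder consumes a leading byte-order mark.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharViewVsDecode {
    // Same bytes as in the question.
    static final byte[] BYTES = {0x00, 0x48, 0x00, 0x69, 0x00, 0x2c,
                                 0x60, (byte) 0xA8, 0x59, 0x7D, 0x00, 0x21};

    // Reinterpret byte pairs as chars (default BIG_ENDIAN order, no validation).
    static String viaView() {
        return ByteBuffer.wrap(BYTES).asCharBuffer().toString();
    }

    // Run the bytes through a real charset decoder.
    static String viaDecode(Charset cs) {
        return cs.decode(ByteBuffer.wrap(BYTES)).toString();
    }

    // Encode the viewed characters as UTF-8 and return the result as lowercase hex.
    static String utf8Hex() {
        ByteBuffer out = StandardCharsets.UTF_8.encode(ByteBuffer.wrap(BYTES).asCharBuffer());
        StringBuilder sb = new StringBuilder();
        while (out.hasRemaining()) {
            sb.append(String.format("%02x", out.get()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(viaView().equals(viaDecode(StandardCharsets.UTF_16BE)));
        System.out.println(viaView().equals(viaDecode(StandardCharsets.UTF_16)));
        System.out.println(utf8Hex()); // 48692ce682a8e5a5bd21
    }
}
```

In the UTF-8 output, the three ASCII characters and the final `!` take one byte each, while U+60A8 and U+597D take three bytes each (`e6 82 a8` and `e5 a5 bd`).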
