I would like to read the bytes into the direct ByteBuffer and then decode them without rewrapping the original buffer into the byte[] array to minimize memory allocations.

Hence I'd like to avoid using StandardCharsets.UTF_8.decode(), as it allocates a new array on the heap.

I'm stuck on how to decode the bytes. Consider the following code, which writes a string into the buffer and then reads it back.

ByteBuffer byteBuffer = ByteBuffer.allocateDirect(2 << 16);

byteBuffer.put("Hello Dávid".getBytes(StandardCharsets.UTF_8));

byteBuffer.flip();

CharBuffer charBuffer = byteBuffer.asCharBuffer();
for (int i = charBuffer.position(); i < charBuffer.length(); i++) {
    System.out.println(charBuffer.get());
}

The code output:

䡥汬漠

How can I decode the buffer?

David Siro
  • You need to flip the `CharBuffer`, not the `ByteBuffer`. The `CharBuffer` doesn't inherit the flipped state. – user207421 Apr 14 '17 at 11:07

2 Answers

I would like to read the bytes into the direct ByteBuffer and then decode them without rewrapping the original buffer into the byte[] array to minimize memory allocations.

ByteBuffer.asCharBuffer() fits your need, indeed, since both wrappers share the same underlying buffer.

This method's javadoc says:

The new buffer's position will be zero, its capacity and its limit will be the number of bytes remaining in this buffer divided by two

Although it's not stated explicitly, this hints that the CharBuffer view treats every two bytes of the underlying buffer as one UTF-16 code unit. Since we have no control over which encoding the char view uses, you have little choice but to write the character bytes in that encoding.

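// UTF_16BE matches the buffer's default big-endian order; plain UTF_16 would prepend a byte-order mark
byteBuffer.put("Hello Dávid".getBytes(StandardCharsets.UTF_16BE));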

One more thing about your printing loop: CharBuffer.length() is the number of chars remaining between the buffer's position and its limit, so it decreases every time you call CharBuffer.get(). Either use the absolute get(int), or change the loop's termination condition to limit().
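
Putting both points together, here is a minimal sketch of the round trip. The class name CharViewDemo and the helper roundTrip are made up for illustration, and the StringBuilder is only there so the result can be checked; it is not needed if you just want to look at the chars one by one.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original question.
public class CharViewDemo {

    // Writes the text as UTF-16BE into a direct buffer and reads it back
    // through a char view that shares the same memory.
    static String roundTrip(String text) {
        ByteBuffer byteBuffer = ByteBuffer.allocateDirect(1 << 10);

        // UTF_16BE: no byte-order mark, and it matches the default
        // big-endian byte order of a freshly allocated ByteBuffer.
        byteBuffer.put(text.getBytes(StandardCharsets.UTF_16BE));
        byteBuffer.flip();

        // The view covers the bytes between position and limit.
        CharBuffer charBuffer = byteBuffer.asCharBuffer();

        // Demo only: collects the chars so the result can be inspected.
        StringBuilder out = new StringBuilder();

        // limit() stays constant and get(int) is absolute, so the loop
        // visits every char exactly once.
        for (int i = 0; i < charBuffer.limit(); i++) {
            out.append(charBuffer.get(i));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("Hello Dávid"));
    }
}
```

Note that getBytes() itself still allocates a byte[]; in a real setup the bytes would presumably arrive in the direct buffer from a channel instead.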

nandsito

You can't specify the encoding of a CharBuffer. See here: What Charset does ByteBuffer.asCharBuffer() use?

Also, since buffers are mutable and a String is always immutable, I don't see how you could ever create a String from a buffer without another memory allocation...
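
If the goal is to look at individual chars rather than to build a String, one option that is not taken from the answers here is to drive a CharsetDecoder by hand and decode into a CharBuffer that is allocated once and reused. The following is only a sketch under that assumption: the class name Utf8DecodeDemo, the helper decodeUtf8 and the chunkChars parameter are made up for illustration, and the StringBuilder exists purely so the result can be checked.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original thread.
public class Utf8DecodeDemo {

    // Decodes UTF-8 bytes held in a direct buffer, chunk by chunk,
    // into one small reusable CharBuffer.
    static String decodeUtf8(byte[] utf8, int chunkChars) {
        if (chunkChars < 2) {
            // A surrogate pair needs room for two chars at once.
            throw new IllegalArgumentException("chunkChars must be >= 2");
        }

        ByteBuffer in = ByteBuffer.allocateDirect(Math.max(1, utf8.length));
        in.put(utf8);
        in.flip();

        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        CharBuffer out = CharBuffer.allocate(chunkChars); // allocated once, reused
        StringBuilder seen = new StringBuilder();         // demo only

        CoderResult result;
        do {
            // All input is already in the buffer, so endOfInput is true.
            result = decoder.decode(in, out, true);
            if (result.isError()) {
                throw new IllegalArgumentException("Cannot decode input: " + result);
            }
            drain(out, seen);
        } while (result.isOverflow()); // overflow means there is more to decode

        // Complete the decoding operation; repeat while the output fills up.
        do {
            result = decoder.flush(out);
            drain(out, seen);
        } while (result.isOverflow());

        return seen.toString();
    }

    // Reads every decoded char out of the buffer, then resets it for reuse.
    private static void drain(CharBuffer out, StringBuilder seen) {
        out.flip();
        while (out.hasRemaining()) {
            seen.append(out.get()); // this is where each single char is available
        }
        out.clear();
    }

    public static void main(String[] args) {
        byte[] bytes = "Hello Dávid".getBytes(StandardCharsets.UTF_8);
        System.out.println(decodeUtf8(bytes, 4));
    }
}
```

The decoder and the CharBuffer can be kept around and reused between reads (call decoder.reset() before starting a new input), so the per-read cost is the decoding itself rather than a fresh array.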

john16384
  • Thanks John, but my question still holds. How can I decode the characters from the direct ByteBuffer without additional memory allocations? – David Siro Apr 14 '17 at 09:29
  • Your question was `How can I specify the encoding on a direct CharBuffer?` though... – john16384 Apr 14 '17 at 09:30
  • IMHO this should be a comment, or a duplicate close vote, not an answer. – glee8e Apr 14 '17 at 09:30
  • And as I already said in my answer, you can't do this as `String` in Java is immutable, so there is no way that you can do this without a copy (otherwise you could modify the `String` by modifying the buffer...) – john16384 Apr 14 '17 at 09:30
  • Fair enough, will rephrase it. I'm more interested in reading single chars than the whole string. – David Siro Apr 14 '17 at 09:31