I am creating two arrays in C++ that will be read on the Java side, using:

env->NewDirectByteBuffer
env->NewByteArray

Do these functions copy the buffer I pass in? Do I need to allocate the buffer on the heap on the C++ side, or is it OK to create it on the stack because the JVM will copy it?

For example, will this code run OK?

std::string stam = "12345";
const char *buff = stam.c_str();
jobject directBuff = env->NewDirectByteBuffer((void*)buff, (jlong) stam.length() );

Another example:

std::string md5 = "12345";
jbyteArray md5ByteArray = env->NewByteArray((jsize) md5.length());
env->SetByteArrayRegion(md5ByteArray, 0, (jsize) md5.length(), (const jbyte*) md5.c_str());

The string is created on the stack. Will this code always work, or do I need to create these strings on the heap and take responsibility for deleting them after Java finishes using them?

Robert Karl
Shay

2 Answers

Your use of NewDirectByteBuffer will almost certainly fail in spectacular, core-dumping, and unpredictable ways, and its behavior may vary between JVM implementations and operating systems. The problem is that the native memory must remain valid for the lifetime of the DirectByteBuffer. Your std::string is a local variable, so its character buffer is released as soon as it goes out of scope, while the Java code may or may not still be using the DirectByteBuffer. Are you writing the Java code too? Can you guarantee that it is finished with the DirectByteBuffer before the string goes out of scope?

Even if you can guarantee that, realize that Java's GC is non-deterministic. It is all too easy to think that your DirectByteBuffer isn't being used any more, but meanwhile it is wandering around in unreclaimed objects, which eventually get hoovered up by the GC, which may call some finalize() method that accidentally touches the DirectByteBuffer, and -- kablooey! In practice, it is very difficult to make these guarantees except for blocks of "shared memory" that never go away for the life of your application.
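One way to decouple the lifetime of the native memory from any stack frame is to copy the data into a heap block that is freed only on an explicit request. The following is a minimal sketch under that assumption, not a definitive implementation; `DirectBlockRegistry`, `publish`, `release`, and `liveBlocks` are hypothetical names. The JNI call is left as a comment so that the block compiles without `jni.h`.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical registry that owns the memory behind direct buffers.
// A block stays valid until release() is called for it, regardless of
// which stack frame created it.
class DirectBlockRegistry {
public:
    // Copies `src` into a fresh heap block and returns its address.
    // A JNI wrapper would then call:
    //   env->NewDirectByteBuffer(addr, static_cast<jlong>(src.size()));
    void* publish(const std::string& src) {
        auto block = std::make_unique<char[]>(src.size());
        std::memcpy(block.get(), src.data(), src.size());
        void* addr = block.get();
        std::lock_guard<std::mutex> lock(mutex_);
        blocks_.emplace(addr, std::move(block));
        return addr;
    }

    // Frees the block. Call this only once Java can no longer touch the
    // buffer. Returns false if `addr` is unknown or already released.
    bool release(void* addr) {
        std::lock_guard<std::mutex> lock(mutex_);
        return blocks_.erase(addr) == 1;
    }

    // Number of blocks that are still alive.
    std::size_t liveBlocks() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return blocks_.size();
    }

private:
    mutable std::mutex mutex_;
    std::unordered_map<void*, std::unique_ptr<char[]>> blocks_;
};
```

The hard part described above remains: the registry cannot know when Java is done, so `release` has to be driven by an explicit call from the Java side or deferred until shutdown.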

NewDirectByteBuffer is also not that fast (at least not on Windows), despite the intuitive assumption that performance is what it is all about. I've found experimentally that it is faster to copy 1000 bytes than it is to create a single DirectByteBuffer. It is usually much faster to have the Java code pass a byte[] into C++ and have the C++ code copy bytes into it (ahem, assuming they fit). Overall, I make these recommendations:

  1. Call NewByteArray() and SetByteArrayRegion(), return the resulting jbyteArray to Java, and have no worries.
  2. If performance is a requirement, pass the byte[] from Java to C++ and have C++ fill it in. You might need two C++ calls: one to get the size and the next to get the data.
  3. If the data is huge, use NewDirectByteBuffer and make sure that the C++ data stays around "forever", or until you are darn certain that the DirectByteBuffer has been disposed of.
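The two-call pattern in recommendation 2 can be sketched on the C++ side as a pair of plain functions. This is only an illustration: `payload`, `payloadSize`, and `fillPayload` are hypothetical names, the data source is a stand-in, and the JNI glue is shown as a comment so that the block compiles without `jni.h`.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical native data source; stands in for whatever the C++ side produces.
static const std::string& payload() {
    static const std::string data = "12345";
    return data;
}

// Call 1: Java asks how many bytes it has to allocate.
std::size_t payloadSize() {
    return payload().size();
}

// Call 2: copy at most `capacity` bytes into caller-owned memory and
// return the number of bytes written. In a JNI wrapper the copy would be:
//   env->SetByteArrayRegion(dst, 0, static_cast<jsize>(n),
//                           reinterpret_cast<const jbyte*>(src.data()));
// SetByteArrayRegion copies into the Java byte[], so the source may be
// freed or go out of scope afterwards.
std::size_t fillPayload(char* dst, std::size_t capacity) {
    const std::string& src = payload();
    const std::size_t n = std::min(capacity, src.size());
    std::memcpy(dst, src.data(), n);
    return n;
}
```

On the Java side this corresponds to asking for the size, allocating a byte[] of that length, and passing it to the second native method.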

I've also read that both C++ and Java can memory-map the same file, and that this works very well for large data.

Wheezil
  • Thanks for the good answer. The Java side can't create the byte array and send it to the C++ side to fill, because it doesn't know the size of the array. That's why I need the C++ side to create the buffer. – Shay Mar 02 '15 at 09:29
  • Shay, that's why I mentioned using a two-step approach. The first call from Java to C++ asks for the size, Java allocates the byte[], and the second call from Java to C++ asks for the array to be filled now that the size is known. Because calls from Java to C++ are very fast (unlike C++ calling Java, which is slow), this doesn't add much to the overall time. We use this technique in production and it works well. – Wheezil Jan 17 '19 at 21:41
  • @Wheezil Is the second option achieved via GetByteArrayElements() on the passed byte[], followed by ReleaseByteArrayElements()? Also, how do we guarantee that the passed byte[] isn't a copy? – Kevin Jun 11 '20 at 04:48
  • @Kevin When you pass a byte[] from Java to C++, the jbyteArray parameter is not a copy, just like every object you pass from Java to C++ as a method parameter. – Wheezil Jun 12 '20 at 11:57
  • While I failed to mention it, in the second option you call SetByteArrayRegion(), not GetByteArrayElements(). – Wheezil Jan 09 '23 at 00:14
1
  • NewDirectByteBuffer: "Allocates and returns a direct java.nio.ByteBuffer referring to the block of memory starting at the memory address address and extending capacity bytes.

    "Native code that calls this function and returns the resulting byte-buffer object to Java-level code should ensure that the buffer refers to a valid region of memory that is accessible for reading and, if appropriate, writing. An attempt to access an invalid memory location from Java code will either return an arbitrary value, have no visible effect, or cause an unspecified exception to be thrown."

    No copying there.

  • New<PrimitiveType>Array: the only arguments are JNIEnv * and length, so there is nothing to copy.

  • Set<PrimitiveType>ArrayRegion: "A family of functions that copies back a region of a primitive array from a buffer." So this one does copy.

user207421
  • You should add that the developers should do `GetDirectBufferAddress` to get a `pointer` to the buffer. – Brandon Mar 01 '15 at 09:25
  • @Brandon Why should the developer call that method, when he supplied the buffer in the first place? – user207421 Mar 01 '15 at 09:31
  • If I understand your answer, NewDirectByteBuffer doesn't make a copy of the buffer, so on the C++ side I need to make sure that the buffer is allocated on the heap and not on the stack in order for the Java side to read valid memory. Is this correct? The example I wrote is wrong, and I should memcpy the string into a char* buffer. SetByteArrayRegion does create a copy, so in that case I can use a string on the stack without allocating on the heap. Is this right? – Shay Mar 01 '15 at 09:48
  • DirectByteBuffer is actually slow for small data. Just create the byte[] and be safe. – Wheezil Mar 01 '15 at 21:28