According to the JNI documentation, GetStringUTFChars(), which converts a Java string (jstring) to a C/C++ const char*, takes an optional jboolean* out-parameter (isCopy) that it sets to indicate whether the call made a copy. However, the documentation does not say when GetStringUTFChars() will or will not make a copy. My questions are:

  1. Under which conditions will GetStringUTFChars() make a copy?
  2. Is there any way to avoid the copy in GetStringUTFChars()?
  3. If the answer to question 2 is yes, is it advisable to avoid the copy?
keelar
  • It's off-topic but be sure to understand "modified UTF-8" if you are going to use `GetStringUTFChars`. The return value is the more important part of that function's interface contract. – Tom Blodget Mar 23 '14 at 03:35

2 Answers

According to the book "Essential JNI Java Native Interface", it is the implementation of the JVM that decides whether a copy is done or not. So no, you have no control over the copying.

PaulMcKenzie
  • Thank you very much for your answer :) +1 for valuable quotation! Would you recommend that book? – keelar Mar 22 '14 at 08:21
  • I recommend it. The author is Rob Gordon. – PaulMcKenzie Mar 22 '14 at 12:30
  • @PaulMcKenzie See this one, however: http://stackoverflow.com/questions/5859673/should-you-call-releasestringutfchars-if-getstringutfchars-returned-a-copy – manuell Mar 22 '14 at 13:00
  • Thanks for the info. The Oracle web page that explains this has expired, though. Admittedly, it seems to be getting harder to find good docs on JNI, as if it were becoming a "lost art". – PaulMcKenzie Mar 22 '14 at 16:51
  • So, unless you are doing some kind of profiling, pass `NULL`. It's hard to see why a JVM would have a "modified UTF-8" version of a string hanging around: there is no other use for it than JNI, and even there its applicability is specious. I suppose if you called `GetStringUTFChars` on the same string or the same string value multiple times, a JVM might know it hasn't yet overwritten the conversion buffer (internally, strings are UTF-16). The question is more applicable to `GetStringChars` (standard UTF-16). Perhaps the signature was simply made to follow the same pattern. – Tom Blodget Mar 23 '14 at 03:33

Whether the string is copied is always up to the JVM implementation, but you can reduce the likelihood of a copy by using GetStringCritical instead of GetStringUTFChars. Note, however, that GetStringCritical returns a UTF-16-encoded string (as opposed to the modified-UTF-8-encoded string returned by GetStringUTFChars), so you may need to convert it to your desired encoding.
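
If you do need UTF-8 after calling GetStringCritical, the conversion could look roughly like the sketch below. This is only an illustration: `utf16ToUtf8` is a hypothetical helper name, not a JNI function, it uses `std::uint16_t` as a stand-in for `jchar` so that it compiles without `jni.h`, and replacing unpaired surrogates with U+FFFD is my own assumption about what you would want.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of JNI): converts UTF-16 code units, such as
// those obtained from GetStringCritical, to standard UTF-8.
std::string utf16ToUtf8(const std::uint16_t* units, std::size_t length) {
    std::string out;
    out.reserve(length);  // lower bound only; the string grows as needed
    for (std::size_t i = 0; i < length; ++i) {
        std::uint32_t cp = units[i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < length &&
            units[i + 1] >= 0xDC00 && units[i + 1] <= 0xDFFF) {
            // Combine a high surrogate with the low surrogate that follows it.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (units[i + 1] - 0xDC00);
            ++i;
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // assumption: unpaired surrogates become U+FFFD
        }
        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

In real JNI code you would pass the pointer returned by GetStringCritical together with the result of GetStringLength, then call ReleaseStringCritical. Keep in mind that the JNI spec forbids other JNI calls and blocking between the Get and Release calls, so it may be safer to copy the UTF-16 data out first and convert afterwards.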

In my experience (Oracle 1.8.0_51-b16 on OS 10.11), GetStringUTFChars always returns a copy and GetStringCritical never returns a copy, which isn't surprising since Java stores Strings internally as UTF-16, and so extracting UTF-8 will probably require making a copy of the data.
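
To make the "requires a copy" point concrete, here is a rough sketch of the modified UTF-8 encoding that GetStringUTFChars produces. It is not the JVM's actual code: `utf16ToModifiedUtf8` is a hypothetical name and `std::uint16_t` stands in for `jchar`. The byte layout differs from the UTF-16 input for every character, so the result has to live in a new buffer.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Illustrative helper (hypothetical name, not a JNI function): encodes UTF-16
// code units as modified UTF-8, writing the result into a fresh buffer.
std::string utf16ToModifiedUtf8(const std::uint16_t* units, std::size_t length) {
    std::string out;
    out.reserve(length);  // lower bound only; the string grows as needed
    for (std::size_t i = 0; i < length; ++i) {
        const std::uint16_t u = units[i];
        if (u != 0 && u < 0x80) {
            // U+0001..U+007F: one byte, same as ASCII.
            out.push_back(static_cast<char>(u));
        } else if (u < 0x800) {
            // Two bytes. This branch also handles U+0000, which becomes
            // C0 80 so the output never contains an embedded NUL byte.
            out.push_back(static_cast<char>(0xC0 | (u >> 6)));
            out.push_back(static_cast<char>(0x80 | (u & 0x3F)));
        } else {
            // Three bytes. Surrogates are encoded one by one, so a
            // supplementary character takes six bytes rather than four.
            out.push_back(static_cast<char>(0xE0 | (u >> 12)));
            out.push_back(static_cast<char>(0x80 | ((u >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (u & 0x3F)));
        }
    }
    return out;
}
```

The two differences from standard UTF-8 (the C0 80 encoding of U+0000 and the separately encoded surrogates) are also why the result should not be handed to code that expects strict UTF-8.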

See: https://docs.oracle.com/javase/8/docs/technotes/guides/jni/spec/functions.html#GetStringCritical_ReleaseStringCritical

marcprux