Are Java Strings stored differently in heap based on how they are being constructed?

Question

I'm working on a code base which uses JNI techniques for modeling native methods.

Here is a segment of the native peer method used for java.lang.String#equals(Object):

@MJI
public boolean equals__Ljava_lang_Object_2__Z (MJIEnv env, int objRef, int argRef) {

    ElementInfo s1 = heap.get(objRef); // this
    ElementInfo s2 = heap.get(argRef);

    Fields f1 = heap.get(s1.getField("value")).getFields();
    Fields f2 = heap.get(s2.getField("value")).getFields();

    char[] c1 = ((CharArrayFields) f1).asCharArray();
    char[] c2 = ((CharArrayFields) f2).asCharArray();

This works fine on Java 8. But on Java 9 and later, the object returned for the String's value field is sometimes a char[] and sometimes a byte[].

I expect it to return a byte[] array since the changes made in JEP 254: Compact Strings

So for instance:

char[] chars = new char[] {'a', 'b', 'c', 'd'};
String str1 = new String(chars);
"str2".equals(str1);

Here I get a char array for str1 and a byte array for "str2" in the peer method. Is this because Strings are stored differently on the heap depending on how they are constructed?

FYI:

Here is the code that I'm actually working on. I'm trying to make it work with Java 9 and later:

jpf-core/src/peers/gov/nasa/jpf/vm/JPF_java_lang_String.java#L166-L200

As you can see, the valueField there is cast to CharArrayFields. But when running on Java 10.0.1, valueField is sometimes a CharArrayFields and sometimes a ByteArrayFields.
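For reference, here is a minimal sketch of what I think the peer would need if the value field really is a byte[] as described in JEP 254: a helper that turns the byte[] plus the String's coder field into the char[] that the existing comparison code expects. The class name CompactStringDecoder and the method toChars are hypothetical and not part of JPF. The coder constants (LATIN1 = 0, UTF16 = 1) are the ones used in the JDK's java.lang.String source. As far as I understand, the JDK stores UTF16 content in the platform's native byte order, so the byte order is passed in explicitly here.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of JPF: decodes a JEP 254 style
// (byte[] value, byte coder) pair into a char[].
final class CompactStringDecoder {
    static final byte LATIN1 = 0; // one byte per char
    static final byte UTF16 = 1;  // two bytes per char

    static char[] toChars(byte[] value, byte coder, ByteOrder order) {
        if (coder == LATIN1) {
            char[] out = new char[value.length];
            for (int i = 0; i < value.length; i++) {
                // Mask to avoid sign extension for bytes >= 0x80.
                out[i] = (char) (value[i] & 0xff);
            }
            return out;
        }
        // UTF16: reinterpret each pair of bytes as one char.
        char[] out = new char[value.length / 2];
        ByteBuffer.wrap(value).order(order).asCharBuffer().get(out);
        return out;
    }

    public static void main(String[] args) {
        byte[] latin1 = {'a', 'b', 'c', 'd'};
        char[] chars = toChars(latin1, LATIN1, ByteOrder.nativeOrder());
        System.out.println(new String(chars));
    }
}
```

In the peer, the byte[] would come from the ByteArrayFields of the value field and the coder from the String's coder field. How to read those through the JPF heap API is left out, since I am not sure of the exact calls.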

Your question isn't quite clear. You said that you *expect* a `byte[]`, and you apparently *get* a `byte[]`. In your example, the type of `str1` *isn't* a `String`, it's actually inherently a `char[]`. — chrylis -cautiouslyoptimistic-, Jul 30 '18 at 06:09
Additionally, can you confirm an exact JRE version? The code in OpenJDK 10 recodes if you use `new String(char[])`. — chrylis -cautiouslyoptimistic-, Jul 30 '18 at 06:26
Additionally, if you meant `String str1 = new String(new char[]{'a', 'b', 'c', 'd'}); String str2 = "abcd"; System.out.println(str1.equals(str2));`.. both the string are represented as **`byte[]`** to what I could debug using an IDE. — Naman, Jul 30 '18 at 06:30
it's hard to answer without knowing where these fields come from, and the link does not help - too complex. I suspect that JNI is *transforming* the `byte[]` to `char[]` depending how the bytes are encoded... the String source has no `char[]` field(Java 10.0.2) — user85421, Jul 30 '18 at 06:46
can you please provide 1 string examples for which you are getting char array? — DhaRmvEEr siNgh, Jul 30 '18 at 06:52
@DhaRmvEErsiNgh he already did : "`String str1 = new String(chars))` ... here I get a char array for str1" — user85421, Jul 30 '18 at 06:53
No, in both cases a string is stored as a byte array in LATIN1 encoding. — apangin, Jul 30 '18 at 06:55
@DhaRmvEErsiNgh you did not ask about how it is being stored, but "... examples for which you are getting char array".... and there seems to be some steps (JNI, MJI, ...) inbetween — user85421, Jul 30 '18 at 06:59
Also the char array being returned is correct. So I'll check to see if JNI is transforming the byte array in somewhere. — Gayan Weerakutti, Jul 30 '18 at 07:01
All right, this question is now well-defined and weird enough it gets a +1. — chrylis -cautiouslyoptimistic-, Jul 30 '18 at 07:38
@CarlosHeuberger Thanks for the tip. `value` field of Strings are constructed internally as a char array. It took some digging to find out. You may convert these to an answer. — Gayan Weerakutti, Jul 30 '18 at 08:50
@reversiblean Until Java 8, `value` is a char array, but from Java 9 `value` is a `byte[]`. So I just want to know whether you are looking at the String class in JDK 8 or JDK 9. — DhaRmvEEr siNgh, Jul 30 '18 at 08:58
@DhaRmvEErsiNgh It's Java 9. The project I'm working on has its own Heap implementation. It assumes `value` field to be a char array. Hence causing problems. It took me a while to figure it out. — Gayan Weerakutti, Jul 30 '18 at 10:31
The representation of the Java strings is the same for `"str2"` and `"abcd"`, regardless of how they are constructed. So this is not an issue with Java, but that library you’re using. Besides that, what’s the point of re-implementing what `java.lang.String` already does, besides lots of work and getting inconsistencies like in this question? — Holger, Jul 30 '18 at 15:12
@Holger To my understanding, one reason that it models classes is to be able to inspect/trace native methods. — Gayan Weerakutti, Jul 30 '18 at 15:52
