Please help me understand how Java stores strings and char arrays.
In Java, Character.SIZE returns 16, and most of the answers on Stack Overflow and the web state that a character in Java is 16 bits (obviously, since it uses UTF-16 internally). However, UTF-16 can't fit everything in 2 bytes. For example, Chinese:
char c = '的';
System.out.println(Arrays.toString(Character.toString(c).getBytes(StandardCharsets.UTF_16)));
This piece of code prints [-2, -1, 118, -124], which looks like the char was 4 bytes long. Does that mean that Strings in Java, which consist of a char[] array, take 4 bytes for every char? That'd take too much space, so I assume that's not what happens. It must be that char has variable length. If so, it's impossible to store a char[] as one long list of bytes in RAM without first specifying the length of each individual char, and that'd take too much space as well.
So what's the actual size of a char in Java? And how is it stored in RAM if it has variable length?
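To test the premise, here is a minimal sketch that encodes the same character with and without a byte-order mark. It assumes that StandardCharsets.UTF_16 prepends a two-byte byte-order mark when encoding, which would account for the leading -2, -1 in the output above. The class name CharSizeCheck and the helper utf16BeBytes are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class CharSizeCheck {
    // Hypothetical helper: encode as big-endian UTF-16, which writes no byte-order mark.
    static byte[] utf16BeBytes(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE);
    }

    public static void main(String[] args) {
        String chinese = "\u7684"; // the character '的' from the question

        // Same call as in the question: the first two bytes are the byte-order mark.
        System.out.println(Arrays.toString(chinese.getBytes(StandardCharsets.UTF_16)));
        // Without the mark, only the two bytes of the single char remain.
        System.out.println(Arrays.toString(utf16BeBytes(chinese)));

        // A character outside the Basic Multilingual Plane (U+1F600),
        // written as its two UTF-16 code units.
        String emoji = "\uD83D\uDE00";
        System.out.println(emoji.length());                           // number of char values
        System.out.println(emoji.codePointCount(0, emoji.length()));  // number of Unicode characters
        System.out.println(utf16BeBytes(emoji).length);               // bytes in UTF-16BE
    }
}
```

If the assumption holds, the second line printed is [118, -124], so '的' fits in one 16-bit char. The emoji reports a length of 2 but a code point count of 1, which suggests that a char is always a fixed 16-bit code unit and that characters which don't fit are stored as two of them.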