I am trying to understand the UTF-8 standard. According to Wikipedia, the first 128 characters (2^7) are reserved for the ASCII characters.
I want to pass a string as a query parameter to Cosmos DB with the SQL API, which has a query size limit of 256 KB, and the database threw an exception because the query exceeded that limit.
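For context, this is roughly how the encoded size of a string could be checked before sending it. It is only a minimal sketch: it assumes the limit is counted in UTF-8 encoded bytes (I have not confirmed how Cosmos DB measures it), and the class name `QuerySizeCheck`, the helper methods, and the `LIMIT_BYTES` constant are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

public class QuerySizeCheck {
    // Hypothetical constant: the 256 KB figure from the question, assumed to be counted in bytes.
    static final int LIMIT_BYTES = 256 * 1024;

    // Number of bytes the string occupies when encoded explicitly as UTF-8,
    // independent of the JVM's default charset.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // True if the UTF-8 encoded string is within the assumed limit.
    static boolean fitsLimit(String s) {
        return utf8Length(s) <= LIMIT_BYTES;
    }

    public static void main(String[] args) {
        String ascii = "hello";
        // Unicode escapes are used so the result does not depend on the source file encoding.
        String mixed = "h\u00e9llo \u20ac";

        // ASCII characters take one byte each in UTF-8.
        System.out.println(ascii.length() + " chars, " + utf8Length(ascii) + " bytes"); // 5 chars, 5 bytes
        // Non-ASCII characters take two or more bytes each.
        System.out.println(mixed.length() + " chars, " + utf8Length(mixed) + " bytes"); // 7 chars, 10 bytes
        System.out.println("fits limit: " + fitsLimit(mixed));
    }
}
```

Passing the charset explicitly to `getBytes` keeps the measurement the same no matter what the platform default is, which is why the sketch avoids the no-argument `getBytes()`.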
Furthermore, when I print the default character set used in my Java 8 app with Charset.defaultCharset(), I get UTF-8 as the output, which also happens to be the value of the "file.encoding" property.
But when I print all the properties in my Spring Boot app, I also get the following:
sun.jnu.encoding=Cp1252
file.encoding.pkg=sun.io
sun.io.unicode.encoding=UnicodeLittle
According to this answer to "What is the default encoding of the JVM?", there are three "default" encodings:
file.encoding:
System.getProperty("file.encoding")
java.nio.charset.Charset:
Charset.defaultCharset()
And the encoding of the InputStreamReader:
InputStreamReader.getEncoding()
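The three values listed above can be printed side by side at runtime with something like the following. This is only a sketch of how I am inspecting them; the class name `DefaultEncodings` and its helper methods are made up for illustration, and the reader is built over an empty in-memory stream without an explicit charset so that it picks up the default.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

public class DefaultEncodings {
    // Value of the "file.encoding" system property.
    static String fileEncodingProperty() {
        return System.getProperty("file.encoding");
    }

    // Canonical name of the charset returned by Charset.defaultCharset().
    static String defaultCharsetName() {
        return Charset.defaultCharset().name();
    }

    // Encoding reported by an InputStreamReader created without an explicit charset.
    // getEncoding() may return a historical name (for example "UTF8" rather than "UTF-8"),
    // and it returns null once the reader is closed, so it is read before closing.
    static String readerEncoding() {
        try (InputStreamReader reader =
                 new InputStreamReader(new ByteArrayInputStream(new byte[0]))) {
            return reader.getEncoding();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("file.encoding             = " + fileEncodingProperty());
        System.out.println("Charset.defaultCharset()  = " + defaultCharsetName());
        System.out.println("InputStreamReader default = " + readerEncoding());
        // Printed for comparison with the properties listed earlier in the question.
        System.out.println("sun.jnu.encoding          = " + System.getProperty("sun.jnu.encoding"));
    }
}
```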
Other users in the same answer suggest:
java -XshowSettings
and note that it's going to be locale-dependent.
I'm unable to come to a conclusion as to which encoding is actually being used during execution. Is there a way to check which encoding is in play at runtime? Do the above-mentioned properties influence or override the UTF-8 encoding during build or execution?