
I understand that there are several character sets available.

  1. When the client uses a different character set from the server, how does the conversion work without data loss?

  2. Does Java use a default character set (such as UTF-8/UTF-16), or does it take one from the OS?

  3. I also understand that Windows and Linux use CPxxxx encodings and servers mostly use ISOxxxx (checked via Charset.defaultCharset()). I was expecting UTF-8/UTF-16. Are these character sets not the defaults on those systems? Do we need to specify the character set explicitly wherever we need it? (See the sketch below for the check I ran.)
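For reference, a minimal sketch of the check from point 3. The result depends on the platform's locale settings rather than being a fixed UTF-8/UTF-16 (note that on recent JDKs the default has since changed to UTF-8, but that postdates this question):

```java
import java.nio.charset.Charset;

public class DefaultCharsetCheck {
    public static void main(String[] args) {
        // The JVM's default charset comes from the OS/locale, e.g. windows-1252
        // on many Windows installs or UTF-8 on most modern Linux distributions.
        System.out.println(Charset.defaultCharset());
        System.out.println(System.getProperty("file.encoding")); // the property it derives from
    }
}
```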

Please clarify


1 Answer


For #1: you don't do this manually. As https://stackoverflow.com/a/655948 explains, use a built-in converter and let it handle the edge cases.
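A minimal sketch of that round trip, assuming (purely for illustration) that the client sent bytes encoded in windows-1252 and the server stores UTF-8:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetRoundTrip {
    public static void main(String[] args) {
        Charset clientCharset = Charset.forName("windows-1252"); // assumed client encoding
        byte[] clientBytes = "caf\u00e9".getBytes(clientCharset); // "café"

        // Decode with the client's charset; the intermediate String is UTF-16,
        // which can represent any Unicode character losslessly.
        String decoded = new String(clientBytes, clientCharset);

        // Re-encode with the server's charset.
        byte[] serverBytes = decoded.getBytes(StandardCharsets.UTF_8);
        System.out.println(new String(serverBytes, StandardCharsets.UTF_8)); // café
    }
}
```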

For #2: per the official Javadoc, "A String represents a string in the UTF-16 format in which supplementary characters are represented by surrogate pairs."
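A small illustration of what that means in practice: a supplementary character such as U+1F600 occupies two char code units (a surrogate pair) inside a String but is a single Unicode code point:

```java
public class SurrogatePairDemo {
    public static void main(String[] args) {
        String s = "\uD83D\uDE00"; // U+1F600, a supplementary character outside the BMP
        System.out.println(s.length());                            // 2 -> two UTF-16 code units
        System.out.println(s.codePointCount(0, s.length()));       // 1 -> one Unicode code point
        System.out.println(Integer.toHexString(s.codePointAt(0))); // 1f600
    }
}
```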

For #3: if you are using someone else's server, then you need to convert your input to match their expected values/encoding.
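A sketch of naming the encoding explicitly when writing to a server's stream, assuming (hypothetically) the server expects ISO-8859-1; a ByteArrayOutputStream stands in for the socket stream here:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class ExplicitEncoding {
    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream(); // stands in for a socket stream

        // Name the charset explicitly instead of relying on the platform default.
        try (Writer writer = new OutputStreamWriter(out, StandardCharsets.ISO_8859_1)) {
            writer.write("caf\u00e9"); // "café"
        }
        System.out.println(out.size()); // 4 bytes: one byte per character in ISO-8859-1
    }
}
```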

  • Thanks for the response. So you mean that character set conversion happens internally between client and server, and we don't need to bother about it? – Panayappan Swaminathan Dec 21 '15 at 08:59
  • When Java uses UTF-16 (2-byte code units) internally and the Oracle DB is UTF-8 (variable width), does the conversion from UTF-16 to UTF-8 happen before the data is stored in the DB? Is that correct? – Panayappan Swaminathan Dec 21 '15 at 09:00