I am trying to make a Java application and a Visual Studio C++ application exchange messages over sockets. The only problem I have so far is that I am completely lost in their encodings.
As far as I know, Java uses UTF-8 by default, which is a Unicode encoding. In my VS project the character set is set to Unicode, yet for some reason when I debug my code I always see my strings encoded as CP1252 in memory.
Furthermore, if I use CP1252 in Java it works fine for English letters, but whenever I try Russian letters I get a 0x3f byte for every letter.
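Here is a minimal sketch of how the 3f bytes can be reproduced on the Java side. The class name Cp1252Demo and the helpers hex and encode are made up for illustration, and I am assuming the JRE ships the windows-1252 charset (Java's name for CP1252). According to the String.getBytes documentation, characters that the target charset cannot represent are replaced with the charset's replacement byte. CP1252 has no Cyrillic letters, so each one seems to come out as '?' (0x3f).

```java
import java.nio.charset.Charset;

final class Cp1252Demo {
    // Formats bytes as space-separated lowercase hex, e.g. "3f 3f".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Encodes text with the named charset; unmappable characters are
    // silently replaced by the charset's replacement byte.
    static byte[] encode(String text, String charsetName) {
        return text.getBytes(Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // English letters exist in CP1252, so they encode normally.
        System.out.println(hex(encode("Hi", "windows-1252")));            // 48 69
        // Two Cyrillic letters, written as escapes so the source file's
        // own encoding does not matter.
        System.out.println(hex(encode("\u0414\u0430", "windows-1252"))); // 3f 3f
    }
}
```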
If, on the other hand, I use UTF-8 in Java, each English letter is 1 byte long but every Russian letter is 2 bytes long. Isn't it a multibyte encoding?
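The byte counts can be checked with a minimal sketch like the one below. The class name Utf8LengthDemo and its helpers are hypothetical. It only illustrates that UTF-8 is variable-width (1 byte for ASCII letters, 2 for Cyrillic letters) and that decoding with the same charset gives the original text back.

```java
import java.nio.charset.StandardCharsets;

final class Utf8LengthDemo {
    // Number of bytes the text occupies once encoded as UTF-8.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    // Encodes to UTF-8 and decodes again with the same charset.
    static String roundTrip(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("d"));      // 1 (Latin letter)
        System.out.println(utf8Length("\u0434")); // 2 (Cyrillic letter)
        // Mixed text survives the round trip unchanged.
        System.out.println(roundTrip("\u0434a").equals("\u0434a")); // true
    }
}
```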
Some docs on C++ say that std::string (char) uses the UTF-8 code page and std::wstring (wchar_t) uses UTF-16. When I debug my application I see CP1252 encoding for both of them, though the wstring has zero bytes between the letters.
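The zero bytes can be reproduced from Java as well. This is a minimal sketch with a hypothetical class name, Utf16Demo, assuming that wchar_t on Windows holds little-endian UTF-16 code units. For plain English letters the first byte of each pair has the same value as in CP1252 and the second byte is 00, which may be why the memory view looks like CP1252 with gaps. A Cyrillic letter uses both bytes.

```java
import java.nio.charset.StandardCharsets;

final class Utf16Demo {
    // Formats bytes as space-separated lowercase hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Encodes text as UTF-16 little-endian without a byte order mark.
    static byte[] utf16le(String text) {
        return text.getBytes(StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        System.out.println(hex(utf16le("Hi")));     // 48 00 69 00
        System.out.println(hex(utf16le("\u0414"))); // 14 04
    }
}
```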
Could you please explain how encodings behave in both Java and C++, and how I should make my two apps communicate?