I'm trying to get the character encoding of a JSON string from jsoncpp: UTF-8, ANSI, or Unicode? How can I get the character encoding of a Json::Value? Thanks in advance!
2 Answers
Any string is just a sequence of bytes that conforms, perhaps, to some basic rules (null terminators, symbols prohibited in JSON, etc.). There is no magic way to determine which encoding was used to produce a string, because an encoding is just a way of representing the string as binary data. So the encoding of a JSON string should either be specified by the JSON issuer (in documentation, perhaps), or information about it should be part of the JSON itself (if for some reason different strings have different encodings).
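Although the encoding cannot be determined with certainty, one candidate can at least be ruled out. The following is a minimal sketch, not part of jsoncpp: the function name `looksLikeUtf8` is hypothetical, and it assumes you already have the raw bytes in a `std::string` (for example from `Json::Value::asString()`). It only checks whether the bytes are well-formed UTF-8. A `false` result means the string is definitely not UTF-8; a `true` result means it might be, since pure ASCII text is also valid in most "ANSI" code pages.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns true if `s` is a well-formed UTF-8 byte sequence.
// This rules UTF-8 out when it fails; it does not prove UTF-8 when it succeeds.
bool looksLikeUtf8(const std::string& s) {
    // Smallest code point that legitimately needs 1, 2, 3 or 4 bytes.
    static const unsigned long minCodePoint[] = {0x0, 0x80, 0x800, 0x10000};

    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char lead = static_cast<unsigned char>(s[i]);
        std::size_t extra = 0;   // number of continuation bytes expected
        unsigned long cp = 0;    // decoded code point

        if (lead < 0x80)                { extra = 0; cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { extra = 1; cp = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; }
        else return false;       // stray continuation byte or invalid lead byte

        if (extra > n - i - 1) return false;  // sequence truncated at end of string

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char cont = static_cast<unsigned char>(s[i + k]);
            if ((cont & 0xC0) != 0x80) return false;  // not of the form 10xxxxxx
            cp = (cp << 6) | (cont & 0x3F);
        }

        if (cp < minCodePoint[extra]) return false;       // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;   // UTF-16 surrogate
        if (cp > 0x10FFFF) return false;                  // beyond Unicode range

        i += extra + 1;
    }
    return true;
}
```

For example, `looksLikeUtf8("\xC3\xA9")` (the UTF-8 form of "é") passes, while `looksLikeUtf8("\xE9")` (the same letter in Latin-1) fails because the lead byte announces a three-byte sequence that never arrives.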

Determining the character encoding of a string is quite complicated. See this SO answer for choosing the right application.
Apache Tika, the content analysis toolkit, is perhaps one of the most advanced, according to the following quote:
The Apache Tika™ toolkit detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF). All of these file types can be parsed through a single interface, making Tika useful for search engine indexing, content analysis, translation, and much more. You can find the latest release on the download page.
A JSON string could be analyzed with any of these libraries, yielding a (probable) charset usable for further processing.
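For a whole JSON document (as opposed to a single string value), a much cruder guess than a general-purpose detector is also possible. RFC 4627, section 3, observes that a JSON text starts with two ASCII characters, so the pattern of zero bytes in the first four bytes indicates which Unicode encoding form was used. The sketch below assumes the raw, undecoded document bytes are available in a `std::string`; the function name `guessJsonEncoding` is hypothetical and not part of jsoncpp. It cannot tell UTF-8 apart from an "ANSI" code page, and the newer RFC 8259 simply requires UTF-8 for JSON exchanged between systems.

```cpp
#include <string>

// Hypothetical helper: guesses the Unicode encoding form of a raw JSON document
// from the zero-byte pattern of its first four bytes (RFC 4627, section 3).
// Documents shorter than four bytes are reported as UTF-8 by default.
std::string guessJsonEncoding(const std::string& bytes) {
    if (bytes.size() < 4) return "UTF-8";

    const bool z0 = bytes[0] == '\0';
    const bool z1 = bytes[1] == '\0';
    const bool z2 = bytes[2] == '\0';
    const bool z3 = bytes[3] == '\0';

    if ( z0 &&  z1 &&  z2 && !z3) return "UTF-32BE";  // 00 00 00 xx
    if ( z0 && !z1 &&  z2 && !z3) return "UTF-16BE";  // 00 xx 00 xx
    if (!z0 &&  z1 &&  z2 &&  z3) return "UTF-32LE";  // xx 00 00 00
    if (!z0 &&  z1 && !z2 &&  z3) return "UTF-16LE";  // xx 00 xx 00
    return "UTF-8";                                   // xx xx xx xx (or unknown)
}
```

This only narrows the choice among Unicode encoding forms; a byte-oriented legacy encoding would still be reported as UTF-8, so a validity check or a detection library is needed on top of it.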