I am new to C++ and come from a non-CS background, so please excuse me if this question is silly or has been answered before.
I have a string in C++ that contains Telugu text:
    std::string str = "ఉంది"; // means "exists"; pronounced "vundi"
    std::string substring = str.substr(0, 3); // first 3 bytes, i.e. the first character
The substring above is "ఉ" (pronounced "vu"), and its Unicode code point in hex is 0C09.
How can I get the value 0C09 from the substring? The purpose is to check whether the substring falls within the Unicode range for Telugu (0C00–0C7F).
I have seen other questions, but they apply to Objective-C, Java, PHP, C#, etc. I am looking specifically for a C++ solution using std::string.
As suggested in a comment, I have read the article at joelonsoftware.com/articles/Unicode.html.
Let me update my question with more information. I am using Fedora 19 x86_64 and the encoding is UTF-8. The console displays the text properly.
As per the article, if I understand correctly, an ASCII character is stored in a single byte, while other Unicode characters take several bytes in UTF-8. The code sample above reflects that: each Telugu character here is 3 bytes long. Beyond explaining UTF-8, text encodings and multibyte characters, the article offers no practical help in detecting the language of a Unicode string.
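To make the 3-byte layout concrete, here is a minimal sketch of what I think I am after, assuming the std::string holds valid UTF-8. The helper names `first_code_point` and `is_telugu` are my own, not from any library, and the decoder does not reject overlong encodings or surrogates, so it is only an illustration and not a full validator.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: decodes the first UTF-8 sequence in s and returns
// its code point, or 0xFFFFFFFF if the input is empty, truncated or malformed.
std::uint32_t first_code_point(const std::string& s) {
    const std::uint32_t invalid = 0xFFFFFFFF;
    if (s.empty()) return invalid;

    const unsigned char lead = static_cast<unsigned char>(s[0]);
    std::uint32_t cp = 0;
    std::size_t extra = 0;  // number of continuation bytes expected

    if (lead < 0x80)                { cp = lead;        extra = 0; }  // 0xxxxxxx
    else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }  // 110xxxxx
    else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }  // 1110xxxx
    else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }  // 11110xxx
    else return invalid;  // stray continuation byte or invalid lead byte

    if (s.size() < extra + 1) return invalid;  // sequence is cut short

    for (std::size_t i = 1; i <= extra; ++i) {
        const unsigned char b = static_cast<unsigned char>(s[i]);
        if ((b & 0xC0) != 0x80) return invalid;  // must look like 10xxxxxx
        cp = (cp << 6) | (b & 0x3F);             // append the 6 payload bits
    }
    return cp;
}

// Hypothetical helper: true if cp lies in the Telugu block (0C00 to 0C7F).
bool is_telugu(std::uint32_t cp) {
    return cp >= 0x0C00 && cp <= 0x0C7F;
}
```

With this, `first_code_point(substring)` should give 0x0C09 for the substring above, and streaming it through `std::hex` would show it in hexadecimal.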
Maybe I should rephrase my question:
How can I detect the language of a Unicode string in C++?
Thanks in advance for your help.