I am using a std::string as a text buffer, but I am sure the data contained in that buffer is UTF-16 (i.e. it is really a std::wstring). How can I coerce the std::string into a std::wstring? The std::string type is a misnomer; the data is really a wstring.
2 Answers
Consider using a std::vector<char> instead of a std::string. It's the correct container when you want "a contiguous sequence of bytes."
With a std::vector source container, the code is rather straightforward, assuming you really just want to reinterpret the data (i.e., you really just want to treat the bytes as if they were a sequence of wchar_t):
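std::vector<char> v = get_my_wstring_character_data();
// The byte count must be a whole number of wchar_t units.
if (v.size() % sizeof(wchar_t) != 0)
    throw std::runtime_error("Invalid wstring length");
// Reinterpret the byte range [begin, end) as a range of wchar_t.
std::wstring ws(reinterpret_cast<wchar_t*>(&v[0]),
                reinterpret_cast<wchar_t*>(&v[0] + v.size()));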
If your source is a std::string, this same approach will work if you can guarantee that the implementation of std::string you are using stores its characters contiguously. In practice, this is always the case.
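As a variation on the same idea, the bytes can be copied into a std::wstring with std::memcpy rather than by casting the pointer, which avoids relying on the char buffer being suitably aligned for wchar_t. This is only a minimal sketch: the helper name bytes_to_wstring is made up for illustration, and it assumes the bytes already match the platform's wchar_t size and byte order.

```cpp
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from the answer above): copies raw bytes into a
// std::wstring. Assumes the bytes are already laid out as native wchar_t
// units, i.e. same size and byte order as wchar_t on this platform.
std::wstring bytes_to_wstring(const std::string& bytes)
{
    // The byte count must be a whole number of wchar_t units.
    if (bytes.size() % sizeof(wchar_t) != 0)
        throw std::runtime_error("Invalid wstring length");

    // Allocate the destination first, then copy the bytes over it.
    std::wstring ws(bytes.size() / sizeof(wchar_t), L'\0');
    if (!bytes.empty())
        std::memcpy(&ws[0], bytes.data(), bytes.size());
    return ws;
}
```

The result is a copy, so the original buffer can be released afterwards.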

-
"In practice, this is always the case." You don't know how many hours I've wasted on theory. Practice IS the library and greatly aids implementation. Thanks. – unixman83 Mar 26 '11 at 05:22
-
Do heed Jon's warning, though: the size of `wchar_t` and how it represents characters are both implementation-defined, so if you are writing code that needs to run on multiple platforms, you can't rely on it having a particular size and form. – James McNellis Mar 26 '11 at 05:26
-
`v.data()` is a pointer to a contiguous `char[]` which can be used portably instead of `&v[0]`. You of course don't solve the `wchar_t =?= UTF-16` issue with that. – MSalters Mar 28 '11 at 09:52
There is no guarantee that std::wstring stores/interprets byte arrays as UTF-16 (although it happens to do that on Windows). Check out this question: std::wstring VS std::string
Therefore I would advise you to rethink the idea of constructing a std::wstring from a UTF-16 encoded byte array unless you are sure your application will only ever be compiled with MSVC.
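If portability does matter, one possible alternative is to decode the bytes into a std::u16string (C++11), whose char16_t elements are meant for UTF-16 code units regardless of how wide wchar_t is. The sketch below is an illustration under stated assumptions, not a complete solution: the helper name utf16le_bytes_to_u16string is hypothetical, and it assumes the buffer is little-endian UTF-16 with no byte order mark.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: assembles UTF-16 code units from a byte buffer.
// Assumes little-endian byte order and no BOM; does not validate surrogates.
std::u16string utf16le_bytes_to_u16string(const std::string& bytes)
{
    // Each UTF-16 code unit occupies exactly two bytes.
    if (bytes.size() % 2 != 0)
        throw std::runtime_error("Invalid UTF-16 byte length");

    std::u16string out;
    out.reserve(bytes.size() / 2);
    for (std::size_t i = 0; i < bytes.size(); i += 2) {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        const unsigned lo = static_cast<unsigned char>(bytes[i]);
        const unsigned hi = static_cast<unsigned char>(bytes[i + 1]);
        out.push_back(static_cast<char16_t>(lo | (hi << 8)));
    }
    return out;
}
```

Because the code units are built arithmetically, the result does not depend on the host's byte order or on sizeof(wchar_t).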
-
The particular code in question deals with Win32 APIs, which is why I chose `std::wstring` in the first place. I am a unix nut, didn't even know about wstring until I saw it in the MSVC debugger ;) – unixman83 Mar 26 '11 at 05:24