
I'd like to write a function that creates char/wchar_t strings from a const char* (a 'narrow' string literal).

For example, something like

template<typename T>
std::basic_string<T> test(const char* str) {
    ///
}

So that I can use

std::string a = test<char>("haha");

or

std::wstring b = test<wchar_t>("哈哈"); // NOT L"哈哈"

to create strings based on template arguments.

I know it would be easy if the function parameter were const T* str, but I can't figure out how to do it when the parameter is const char* str. I think some conversion must be performed.

Thanks in advance.

Saddle Point
  • What is the encoding of `"哈哈"`? – ildjarn May 17 '18 at 09:02
  • @ildjarn: it does not really matter, provided the program *knows* how to process it. It is just a sequence of bytes... – Serge Ballesta May 17 '18 at 09:47
  • @SergeBallesta : If you don't know the encoding then how would one widen the characters into `wchar_t`..? – ildjarn May 17 '18 at 09:58
  • Your question can't be fully answered unless you specify (1) the input character encoding used in the `const char*` argument; (2) the output character encoding used in the `wstring` result; and (3) the version of C++, since there are standard conversion classes that were introduced in C++11 and deprecated in C++17, and the definition of `wchar_t` is platform-specific. – DodgyCodeException May 17 '18 at 11:06
  • @ildjarn It's utf-8... – Saddle Point May 18 '18 at 01:33
  • The encoding of a literal `"哈哈"` is very much implementation-defined, and may be UTF-8 or _any other_ narrow multibyte encoding. `u8"哈哈"` is always UTF-8, and is what you should use if you care about the encoding (as it appears you must). – ildjarn May 18 '18 at 01:45
  • See https://stackoverflow.com/questions/42946335/deprecated-header-codecvt-replacement – DodgyCodeException May 18 '18 at 12:21

1 Answer


It's not that hard: you just have to write explicit specializations of your function template. The major drawback is that you will always have to pass the type argument explicitly, because no template argument deduction can occur here: T appears only in the return type, and both specializations take exactly the same parameter, a single const char*.

That being said, you can write:

template<typename T>
std::basic_string<T> test(const char* str) {
    ///
}

template <>
std::basic_string<char> test<char>(const char *str) {
    // generate a std::string
}

template <>
std::basic_string<wchar_t> test<wchar_t>(const char *str) {
    // generate a std::wstring
}

But you will have to call them like this:

std::string str = test<char>("haha");
std::wstring wstr = test<wchar_t>("哈哈");
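
As a minimal sketch of how the two placeholder bodies might be filled in, assuming the narrow input is UTF-8 (as the asker states in the comments): the primary template is only declared, so any other T fails at link time, and the wchar_t specialization decodes the bytes itself. The helper name decode_utf8 is hypothetical, and the decoder does only light validation (it does not reject overlong forms or surrogate code points).

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: decode one UTF-8 sequence starting at p, advance p past
// it, and return the code point. Throws on a bad lead or continuation byte.
inline char32_t decode_utf8(const unsigned char*& p) {
    unsigned char b = *p++;
    if (b < 0x80) return b;                       // single-byte (ASCII)
    int extra = 0;
    char32_t cp = 0;
    if ((b & 0xE0) == 0xC0)      { extra = 1; cp = b & 0x1F; }
    else if ((b & 0xF0) == 0xE0) { extra = 2; cp = b & 0x0F; }
    else if ((b & 0xF8) == 0xF0) { extra = 3; cp = b & 0x07; }
    else throw std::invalid_argument("invalid UTF-8 lead byte");
    while (extra-- > 0) {
        // A terminating '\0' also fails this check, so we never read past it.
        if ((*p & 0xC0) != 0x80)
            throw std::invalid_argument("invalid UTF-8 continuation byte");
        cp = (cp << 6) | (*p++ & 0x3F);
    }
    return cp;
}

// Primary template: declared but not defined.
template <typename T>
std::basic_string<T> test(const char* str);

// Narrow version: a plain copy of the bytes.
template <>
std::basic_string<char> test<char>(const char* str) {
    return std::string(str);
}

// Wide version: assumes UTF-8 input; emits UTF-16 when wchar_t is 16 bits
// wide (as on Windows) and one code point per element otherwise.
template <>
std::basic_string<wchar_t> test<wchar_t>(const char* str) {
    std::wstring out;
    const unsigned char* p = reinterpret_cast<const unsigned char*>(str);
    while (*p != 0) {
        char32_t cp = decode_utf8(p);
        if (sizeof(wchar_t) == 2 && cp > 0xFFFF) {
            cp -= 0x10000;                        // split into a surrogate pair
            out.push_back(static_cast<wchar_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<wchar_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            out.push_back(static_cast<wchar_t>(cp));
        }
    }
    return out;
}
```

Because the encoding of a plain "哈哈" literal is implementation-defined, spelling the bytes out, for example test<wchar_t>("\xE5\x93\x88\xE5\x93\x88"), makes the UTF-8 assumption explicit.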
Serge Ballesta
  • Thanks for your answer. Suppose I want to return a `string/wstring` version of the `const char*` input. The answer seems to work perfectly for input like "haha" if I write something like `std::basic_string(str, str + std::char_traits::length(str));` in the body of the `string` version. However, the `wstring` version with input "哈哈" does not seem to work. How can I perform the conversion correctly in this situation? – Saddle Point May 17 '18 at 09:56
  • @Edityouprofile: You are running into the character-set decoding problem. If you want to convert `"哈哈"` to `L"哈哈"`, you need to know the encoding used and have a method to convert that encoding to Unicode. – Serge Ballesta May 17 '18 at 11:59
  • "A method to convert that encoding to Unicode" - actually, even `L"哈哈"` might still need to be encoded; for example, on Windows, `wchar_t` is only 16 bits and `wstrings` are typically encoded in UTF-16. – DodgyCodeException May 17 '18 at 13:46
  • @DodgyCodeException You are right for the general case, but those characters are in the Basic Multilingual Plane and use only 16 bits. – Serge Ballesta May 17 '18 at 13:51
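
The symptom described in the comments above can be reproduced with a short sketch. It assumes the input bytes are UTF-8; the function name widen_bytes is hypothetical and simply mirrors the iterator-range construction the asker mentions. Each byte is copied into its own wchar_t with no decoding, so ASCII input survives but a multibyte character is split into several meaningless elements.

```cpp
#include <string>

// Hypothetical helper: copy every byte of a narrow string into its own
// wchar_t. No decoding takes place, so this is only correct for ASCII input.
inline std::wstring widen_bytes(const char* str) {
    return std::wstring(str, str + std::char_traits<char>::length(str));
}

// The two characters of "哈哈" occupy three bytes each in UTF-8
// (E5 93 88 for U+54C8), so the result has six elements instead of two.
inline std::size_t naive_length_of_haha() {
    return widen_bytes("\xE5\x93\x88\xE5\x93\x88").size();
}
```

For "haha" the result equals L"haha", which is why the problem only shows up with non-ASCII input.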