
I have a class template in C++ that takes the character type as a template parameter, char_type, such as char, wchar_t, char32_t, etc. The class then uses std::basic_string<char_type> in its code.

Somewhere in the class I fill a table of escape sequences such as "&amp;". This does not work as is, because depending on the character type we would need "&amp;", L"&amp;", U"&amp;", and so on.

Is there a way to avoid specializing the template functions that initialize the table, for instance with some standard function for converting string literals?

As these are escape sequences, they do not contain anything other than ASCII characters.

galinette

4 Answers

4

I would do the following:

template <typename char_type, size_t LENGTH>
constexpr std::basic_string<char_type> literal(const char (&value)[LENGTH])
{
    using string = std::basic_string<char_type>;

    string result{};
    result.reserve(LENGTH);

    std::copy(std::begin(value), std::end(value), std::back_inserter(result));

    return result; // rvo
}

You can use it this way:

// Table of escaping sequences
std::basic_string<char_type> escaping_sequences[] =
{
    literal<char_type>("&amp"),
    literal<char_type>("&foo"),
    literal<char_type>("&bar"),
    ...
};

I've tested it in Ideone:

literal<  char  >("test") // result: std::string
literal<char32_t>("test") // result: std::basic_string<char32_t, std::char_traits<char32_t>, std::allocator<char32_t> >
literal<char16_t>("test") // result: std::basic_string<char16_t, std::char_traits<char16_t>, std::allocator<char16_t> >

It is not tested with every character type, but I hope it helps.

Edit 1

My bad, I just noticed that galinette gave almost the same answer before I did. The only difference between my code and galinette's is that I allocate the resulting string once with reserve instead of relying on the automatic allocation of push_back, and that the number of characters is counted at compile time because LENGTH is a template parameter.

Edit 2

It is possible to avoid the final null character issue by subtracting 1 from the end iterator:

template <typename char_type, size_t LENGTH>
constexpr std::basic_string<char_type> literal(const char (&value)[LENGTH])
{
    using string = std::basic_string<char_type>;

    string result{};
    result.reserve(LENGTH - 1);

    std::copy(std::begin(value), std::end(value) - 1, std::back_inserter(result));

    return result; // rvo
}

Or, using std::copy_n instead of std::copy:

template <typename char_type, size_t LENGTH>
constexpr std::basic_string<char_type> literal(const char (&value)[LENGTH])
{
    using string = std::basic_string<char_type>;

    string result{};
    result.reserve(LENGTH - 1);

    std::copy_n(std::begin(value), LENGTH - 1, std::back_inserter(result));

    return result; // rvo
}
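
std::basic_string only became usable in constant expressions with C++20, so on older compilers the constexpr specifier above may be rejected or have no effect. A minimal sketch of the same idea without constexpr, assuming ASCII-only input; the name literal_plain is hypothetical and not part of the original answer:

```cpp
#include <cstddef>
#include <string>

// Hypothetical variant of literal<>() without constexpr.
// Assumes the input is an ASCII string literal.
template <typename char_type, std::size_t LENGTH>
std::basic_string<char_type> literal_plain(const char (&value)[LENGTH])
{
    // Iterator-range constructor: each char is converted to char_type.
    // LENGTH - 1 leaves out the terminating null character.
    return std::basic_string<char_type>(value, value + LENGTH - 1);
}
```

With this sketch, literal_plain<char32_t>("&amp;") yields a std::u32string of length 5.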
PaperBirdMaster
  • I also used reserve! But your solution is better as it does not count the number of characters at run time, due to the use of LENGTH as a template parameter which is a good idea. – galinette Sep 29 '15 at 14:41
  • @galinette what a day I'm having... you're right, you are also using `reserve`; I'm having a day full of misreads :'( – PaperBirdMaster Sep 29 '15 at 15:54
  • Many thanks for this elegant solution! Just a minor comment after using this code: I ran into the issue that the above code also includes the final null character in the result (e.g. try to evaluate literal("test").length(), this gives 5 instead of the expected 4). I'm not sure what's the best way to solve this. The solution by @galinette does not suffer from this problem, as in that code the variable s is set to the length excluding the null character. – Matthias C. M. Troffaes May 10 '17 at 10:45
  • @MatthiasC.M.Troffaes it should be easy to fix this null character issue by reserving `LENGTH - 1` characters and using `std::copy_n` instead of `std::copy` (I'll edit the answer). – PaperBirdMaster May 10 '17 at 10:51
  • That's brilliant! Also many thanks for the quick response - wasn't expecting that on such an old answer!! – Matthias C. M. Troffaes May 11 '17 at 11:51
  • @MatthiasC.M.Troffaes notifications exist for this reason :) – PaperBirdMaster May 11 '17 at 12:38
  • Is this a valid use of constexpr? I get compiler errors ("'result' declaration is not allowed in 'constexpr' function body"), but I'm not sure if this was added in a later standard. – BTownTKD Feb 05 '18 at 16:22
  • @BTownTKD Which compiler and C++ standard version are you using? – PaperBirdMaster Feb 05 '18 at 16:42
  • MSVC 2015, which apparently has "some" C++14 features, but not "Extended constexpr." Perhaps that's why. https://msdn.microsoft.com/en-us/library/hh567368.aspx – BTownTKD Feb 05 '18 at 17:05
2

The best way may be to define the conversion function ourselves, since converting ASCII to UTF-8/16/32 is a straightforward cast of each character:

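#include <cstddef> // std::size_t
#include <cstring> // std::strlen
#include <string>

template<typename char_type>
std::basic_string<char_type> cvtASCIItoUTFX(const char * litteral)
{
    // We could define a faster specialization in case char_type is char

    // Length is computed at run time and excludes the terminating null
    std::size_t s = std::strlen(litteral);

    std::basic_string<char_type> result;
    result.reserve(s);
    for(std::size_t i=0;i<s;++i)
    {
        // ASCII code points keep the same value in UTF-8/16/32
        result.push_back(static_cast<char_type>(litteral[i]));
    }

    return result;
}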
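
The cast is only safe for 7-bit ASCII: code points 0 to 127 have the same numeric value in UTF-8, UTF-16 and UTF-32, while any other byte would need real transcoding. A minimal sketch that enforces this assumption at run time; the name widen_ascii_checked and the choice of exception are hypothetical, not part of the answer above:

```cpp
#include <stdexcept>
#include <string>

// Hypothetical checked variant: rejects anything that is not 7-bit ASCII.
template <typename char_type>
std::basic_string<char_type> widen_ascii_checked(const std::string& ascii)
{
    std::basic_string<char_type> result;
    result.reserve(ascii.size());
    for (char c : ascii)
    {
        // Bytes above 127 are not ASCII; a plain cast would not
        // produce valid UTF-16/32 for them.
        if (static_cast<unsigned char>(c) > 127)
            throw std::invalid_argument("non-ASCII character in literal");
        result.push_back(static_cast<char_type>(c));
    }
    return result;
}
```

For escape sequences such as "&lt;" the check never fires, so it mainly guards against accidental misuse of the helper.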
galinette
1

"As these are escaping sequences, they do not contain anything else than ASCII characters."

"Is there a way to avoid specializing the template functions for initializing the table, for instance with some standard function for converting string literals?"

No, because the standard doesn't have any conversion functions that stick to such specific subsets.

I'd recommend just using an external generator for the table, or if you really want to stay within C++, to use macros.
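
One way the macro suggestion could look, as a sketch only: the preprocessor pastes the L, u and U prefixes onto the same literal, and a small helper picks the pointer that matches the character type. The names pick_literal and POLY_LITERAL are hypothetical, and the macro only works when its second argument is written directly as a string literal.

```cpp
#include <string>

// Hypothetical helper: the primary template is declared but not defined,
// so only the four specializations below can be used.
template <typename CharT>
const CharT* pick_literal(const char*, const wchar_t*, const char16_t*, const char32_t*);

template <>
inline const char* pick_literal<char>(const char* n, const wchar_t*, const char16_t*, const char32_t*) { return n; }

template <>
inline const wchar_t* pick_literal<wchar_t>(const char*, const wchar_t* w, const char16_t*, const char32_t*) { return w; }

template <>
inline const char16_t* pick_literal<char16_t>(const char*, const wchar_t*, const char16_t* u16, const char32_t*) { return u16; }

template <>
inline const char32_t* pick_literal<char32_t>(const char*, const wchar_t*, const char16_t*, const char32_t* u32) { return u32; }

// Expands to all four spellings of the literal and selects one by type.
#define POLY_LITERAL(char_type, s) pick_literal<char_type>(s, L##s, u##s, U##s)
```

Inside the class template, a table entry could then be written as std::basic_string<char_type>(POLY_LITERAL(char_type, "&amp;")), at the cost of every literal being compiled in all four encodings.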

R. Martinho Fernandes
  • Since an ASCII string is a valid UTF-8 string, and since C++11 has UTF-8 to UTF-16 conversions, that's not completely true. – galinette Sep 29 '15 at 10:29
  • If you read my claim properly, you'll see that it makes no sense to use a superset to refute it. But that aside, there are not enough conversion functions for this in the standard. You'll have to do it by hand. – R. Martinho Fernandes Sep 29 '15 at 10:37
0

This answer only works for non-string (i.e. number) literals

... because only those are expanded to template<char...> by the language.

Since I've spent a while on this, I figured I might as well post it here. Doesn't work with actual character literals because herp derp C++.

template<char16_t... str>
struct Literal16 {
    static constexpr char16_t arr[] = {str...};
    
    constexpr operator const char16_t*() { 
        return arr;
    }
};

template<char... str>
struct Literal8 {
    static constexpr char arr[] = {str...};
    
    constexpr operator const char*() { 
        return arr;
    }
};

template<char... str>
struct PolyLiteral {
    operator const char*() {
        return Literal8<str...>();
    }
    operator const char16_t*() {
        return Literal16<str...>();
    }  
};

template<char... str> PolyLiteral<str...> operator"" _poly() { return PolyLiteral<str...>(); }

int main() {
    const char* test = 123_poly;
    const char16_t* test2 = 123_poly;
}
Bartek Banachewicz