Is it possible to concatenate two string literals using constexpr? Or, put differently, can one eliminate macros in code like:

#include <iostream>

#define nl(str) str "\n"

int main()
{
  std::cout <<
      nl("usage: foo")
      nl("print a message")
      ;

  return 0;
}

Update: There is nothing wrong with using "\n"; however, I would like to know whether one can use constexpr to replace that type of macro.

Micha Wiedenmann
  • 3
    What's wrong with `"usage: foo\n" "print a message\n"`? – R. Martinho Fernandes Nov 08 '12 at 15:39
  • 1
    Probably best to use `std::endl` rather than `\n` – Douglas Leeder Nov 08 '12 at 15:42
  • 1
    @R.MartinhoFernandes Or even `"usage: foo\nprint a message\n"`? – James Kanze Nov 08 '12 at 15:43
  • 13
    @Douglas probably not. If you want to print a newline, why would you print a newline *and* flush? – R. Martinho Fernandes Nov 08 '12 at 15:43
  • @DouglasLeeder In a larger program, yes, it's probably better. Here, it makes absolutely no difference. – James Kanze Nov 08 '12 at 15:44
  • @R.MartinhoFernandes So that you can see roughly how far you've gotten if the program crashes. – James Kanze Nov 08 '12 at 15:44
  • 1
    @R.MartinhoFernandes nothing is wrong with that. However, I would like to understand whether constexpr could be used to replace that kind of macro. – Micha Wiedenmann Nov 08 '12 at 15:46
  • 10
    @DouglasLeeder: `std::endl` is overused when you just want `'\n'`. So I don't think `std::endl` should be used in place of `'\n'`. – Nawaz Nov 08 '12 at 15:51
  • 1
    @DouglasLeeder: No, that doesn't make more sense. Strings are useful for more than insertion into an `ostream`. – Ben Voigt Nov 08 '12 at 15:57
  • @JamesKanze flushing often costs a lot in performance. For things you want to see in case of a crash such as error messages you can use a separate output channel like std::cerr. There's rarely a need to use std::endl; it's very much overused in introductory C++ materials. – bames53 Nov 08 '12 at 15:59
  • 1
    @bames53 _If_ the purges cause a performance problem, you do change to `'\n'`. But `std::endl` is the default. (In my own code, if I'm outputting several lines with no intervening operations, I'll use `'\n'` on all but the last. But I would consider this as an "advanced technique". Beginners should use `std::endl`, period, until they understand the issues well enough to make a knowledgeable choice.) – James Kanze Nov 08 '12 at 16:21
  • @Douglas : See also this question: [What is the C++ iostream endl fiasco?](http://stackoverflow.com/q/5492380/636019) – ildjarn Nov 08 '12 at 20:11
  • @JamesKanze using `std::endl` in favor of `\n` defeats the whole point of buffered streams; it's similar to introducing namespaces but then invoking `using namespace std;`. People should rely on streams doing the right (tm) thing and should not flush them unless they know why they want to flush. – Micha Wiedenmann Nov 09 '12 at 10:47
  • @MichaWiedenmann People who understand buffering will mix `\n` and `std::endl` as most appropriate. People who don't understand buffering (and it's generally _not_ the first thing you explain when teaching C++) should use `std::endl` by default, on the principle of least surprise. – James Kanze Nov 09 '12 at 11:45
  • @James : And then those people post questions on SO asking why on earth their code is so slow. ;-] – ildjarn Nov 09 '12 at 21:37
  • See the code in my answer here: http://stackoverflow.com/questions/15858141/conveniently-declaring-compile-time-strings-in-c/15902804#15902804 – Átila Neves Apr 10 '13 at 07:13

5 Answers

15

A little bit of constexpr, sprinkled with some TMP and a topping of indices gives me this:

#include <array>

// Compile-time index list: seq<0, 1, ..., N-1>.
template<unsigned... Is> struct seq{};
// gen_seq<N> recursively prepends N-1 until it derives from seq<0, ..., N-1>.
template<unsigned N, unsigned... Is>
struct gen_seq : gen_seq<N-1, N-1, Is...>{};
template<unsigned... Is>
struct gen_seq<0, Is...> : seq<Is...>{};

// Expands both index packs to copy the characters into the result array.
template<unsigned N1, unsigned... I1, unsigned N2, unsigned... I2>
constexpr std::array<char const, N1+N2-1> concat(char const (&a1)[N1], char const (&a2)[N2], seq<I1...>, seq<I2...>){
  return {{ a1[I1]..., a2[I2]... }};
}

// I1 covers a1 without its terminating '\0'; I2 covers all of a2, including its '\0'.
template<unsigned N1, unsigned N2>
constexpr std::array<char const, N1+N2-1> concat(char const (&a1)[N1], char const (&a2)[N2]){
  return concat(a1, a2, gen_seq<N1-1>{}, gen_seq<N2>{});
}

Live example.

I'd flesh this out some more, but I have to get going and wanted to drop it off before that. You should be able to work from that.

Xeo
  • I also considered this approach, but used another one because of the "implementation quantities" (Annex B). Though I'm not absolutely sure, I think this limits the length of strings you can work on to 256 (or 1024) chars, whereas a string literal itself can be >65k chars long. – dyp Nov 08 '12 at 18:38
  • @Dyp (and Xeo): this suffers from the same problem as DyP's clever solution, which is that while it produces the expected output, it actually creates the strings at run-time. In order to get it not to do that, as far as I know, you have to do something like `static const auto s = _call to clever constexpr_`. I compiled both of these with clang 3.2 and g++ 4.7.2 (which 'sorry's on DyP's) to look at the assembly code generated. – rici Nov 08 '12 at 19:03
  • @DyP, I was just rereading that section, in fact. Afaics, it *allows* compilers to do constant initialization of temporaries, but it certainly doesn't require them to do so, and neither clang nor gcc does. However, it is quite possible that other text in the standard would also get in the way of constant initialization (beyond just proving that the restrictions in 3.6.2/3 don't apply, which might in itself be tricky). – rici Nov 08 '12 at 19:19
  • @Xeo when I make the fcts constexpr and assign `constexpr auto s = concat(...);` it does not compile on clang 3.1 ("read of uninitialized object"), but I don't understand why. When I add a constexpr ctor to `gen_seq<0, Is...>`, it compiles fine. – dyp Nov 08 '12 at 20:57
  • @DyP: Derp, I actually forgot to make them `constexpr`. Fixed the code and it compiles fine on GCC 4.7.2. Clang 3.1 prob has a bug with `constexpr` here and 3.2 should compile just as fine. – Xeo Nov 09 '12 at 03:38
  • @xeo, it compiles fine on clang 3.2, which surprised me by putting the concatenated string in .rodata (and then calling strlen on it to get its length!). gcc-4.7.2 also compiles it, but it constructs the string at runtime (`movb $104, (%rsp); movb $101, 1(%rsp)...`). I noticed that you changed `main` in liveworkspace to store the result of concat in a variable, but you didn't make the variable static as indicated in my comment above. If the variable is not static, the compiler is not required to do constant initialization. (Making the change gets gcc to do the concat at compile time.) – rici Nov 09 '12 at 06:11
  • @rici: Actually, the compiler should be. `constexpr` variables require compile-time evaluation of the initializer. – Xeo Nov 09 '12 at 06:15
  • @xeo: 7.1.5(9). constexpr variable declarations require that the initialization use only constant expressions, but they don't require the variable to be constant-initialized (3.6.2(2)). Only variables of static or thread storage need to be constant-initialized. The compiler may constant-initialize (as clang 3.2 does in this case) but doesn't require it, so gcc 4.7.2 is not, imo, incorrect. – rici Nov 09 '12 at 06:27
  • @rici: That doesn't make sense... what if you wanted to use a `constexpr` variable as a template argument right in the next line? – Xeo Nov 09 '12 at 06:31
  • @xeo: using the `constexpr` variable as a template argument is independent from how it is initialized at run-time. You can use a `static constexpr` member at compile time without ever odr-using it, so that it doesn't even exist at run-time. (I've been caught out by that one; if you actually use it at run-time, it needs to be defined.) – rici Nov 09 '12 at 06:46
  • @rici: `constexpr size_t size = calc_size(); array a;` -- `size` can't possibly be initialized at runtime. – Xeo Nov 09 '12 at 06:57
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/19323/discussion-between-rici-and-xeo) – rici Nov 09 '12 at 06:58
  • @xeo: Ok, then. In your example, `size` might not even exist at runtime, never mind be initialized. That's more important for larger objects, though. It's an oversimplification, but I would say that `constexpr` means that the object exists at compile-time; and `static` means that the object exists throughout the entire run-time. The two are completely independent. Without `static`, the lifetime (and storage consumption) of the object might be limited to the lifetime of its scope, which in the case of a large array might well be a good thing. – rici Nov 09 '12 at 15:08
  • 8
    The link to the example appears to be broken – NathanOliver Oct 14 '16 at 13:07
1

At first glance, C++11 user-defined string literals appear to be a much simpler approach (if, for example, you're looking for a way to globally enable and disable newline injection at compile time).

Ben Voigt
1
  • You cannot return a (plain) array from a function.
  • You cannot create a new const char[n] inside a constexpr function (§7.1.5/3 dcl.constexpr).
  • An address constant expression must refer to an object of static storage duration (§5.19/3 expr.const) - this disallows some tricks with objects of types having a constexpr ctor assembling the array for concatenation and your constexpr fct just converting it to a ptr.
  • The arguments passed to a constexpr function are not considered to be compile-time constants inside that function, so that you can use the fct at runtime, too - this disallows some tricks with template metaprogramming.
  • You cannot get the individual chars of a string literal passed to a function as template arguments - this disallows some other template metaprogramming tricks.

So (as far as I know), you cannot get a constexpr function that returns a char const* to a newly constructed string or a char const[n]. Note that most of these restrictions don't hold for a std::array, as pointed out by Xeo.
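
To illustrate the point about function arguments, here is a small sketch; `size_tag`, `length_of` and `twice` are names invented for this example. A template parameter such as an array bound is a constant expression inside the function, whereas an ordinary parameter is not, even though a call to the function can still be a constant expression as a whole.

```cpp
#include <cstddef>

// Hypothetical helper that needs a compile-time constant as its argument.
template <std::size_t N>
struct size_tag { static constexpr std::size_t value = N; };

// The array bound N is a template parameter, so it can be used as a
// template argument inside the function.
template <std::size_t N>
constexpr std::size_t length_of(char const (&)[N])
{
    return size_tag<N - 1>::value; // length without the terminating '\0'
}

// A plain parameter is not a constant expression inside the body, because
// the function must also be callable with run-time values.
constexpr std::size_t twice(std::size_t n)
{
    // size_tag<n> t;   // would not compile: n is not a constant expression here
    return 2 * n;
}

// The call as a whole can still be a constant expression.
static_assert(twice(length_of("abc")) == 6, "evaluated at compile time");
constexpr std::size_t doubled = twice(21);
static_assert(size_tag<doubled>::value == 42, "usable as a template argument");
```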

And even if you could return some char const*, a return value is not a literal, and only adjacent string literals are concatenated. This happens in translation phase 6 (§2.2), which I would still call a preprocessing phase. constexpr functions are evaluated later (ref?). (f(x) f(y), where f is a function, is a syntax error afaik.)

But you can return from your constexpr fct an object of some other type (with a constexpr ctor or that is an aggregate) that contains both strings and can be inserted/printed into a basic_ostream.


Edit: here's the example. It's quite long o.O Note that you can streamline this if you just want an additional "\n" at the end of a string. (This is a more generic approach that I just wrote down from memory.)

Edit2: Actually, you cannot really streamline it. Creating the arr data member as an "array of const char_type" with the '\n' included (instead of an array of string literals) uses some fancy variadic template code that's actually a bit longer (but it works, see Xeo's answer).

Note: as ct_string_vector (the name's not good) stores pointers, it should be used only with strings of static storage duration (such as literals or global variables). The advantage is that a string does not have to be copied & expanded by template mechanisms. If you use a constexpr variable to store the result (like in the example main), your compiler should complain if the passed parameters are not of static storage duration.

#include <algorithm>    // std::copy
#include <cstddef>
#include <iostream>
#include <iterator>
#include <type_traits>  // std::remove_const and friends

template < typename T_Char, std::size_t t_len >
struct ct_string_vector
{
    using char_type = T_Char;
    using stringl_type = char_type const*;

private:
    stringl_type arr[t_len];

public:
    template < typename... TP >
    constexpr ct_string_vector(TP... pp)
        : arr{pp...}
    {}

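    // const-qualified explicitly: since C++14, constexpr member functions
    // are no longer implicitly const
    constexpr std::size_t length() const
    {  return t_len;  }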

    template < typename T_Traits >
    friend
    std::basic_ostream < char_type, T_Traits >&
    operator <<(std::basic_ostream < char_type, T_Traits >& o,
        ct_string_vector const& p)
    {
        // pass char_type and T_Traits so the iterator matches the stream type
        std::copy( std::begin(p.arr), std::end(p.arr),
            std::ostream_iterator<stringl_type, char_type, T_Traits>(o) );
        return o;
    }
};

template < typename T_String >
using get_char_type =
    typename std::remove_const < 
    typename std::remove_pointer <
    typename std::remove_reference <
    typename std::remove_extent <
        T_String
    > :: type > :: type > :: type > :: type;

template < typename T_String, typename... TP >
constexpr
ct_string_vector < get_char_type<T_String>, 1+sizeof...(TP) >
make_ct_string_vector( T_String p, TP... pp )
{
    // can add an "\n" at the end of the {...}
    // but then have to change to 2+sizeof above
    return {p, pp...};
}

// better version of adding an '\n':
template < typename T_String, typename... TP >
constexpr auto
add_newline( T_String p, TP... pp )
-> decltype( make_ct_string_vector(p, pp..., "\n") )
{
    return make_ct_string_vector(p, pp..., "\n");
}

int main()
{
    // ??? (still confused about requirements of constant init, sry)
    static constexpr auto assembled = make_ct_string_vector("hello ", "world");
    enum{ dummy = assembled.length() }; // enforce compile-time evaluation
    std::cout << assembled << std::endl;
    std::cout << add_newline("first line") << "second line" << std::endl;
}
dyp
  • The enum isn't necessary; `static constexpr auto` will do it. In fact, `static const auto` will make it constant-initialized if possible. But neither of those will let the second std::cout line act in the same way as the macro in the OP, not even to the extent of optimizing add_newline("first line") into a compile-time literal. – rici Nov 08 '12 at 19:56
  • @rici Sry, but I still don't get why `static constexpr` is sufficient. After all, this is a block-scope static variable and therefore, 6.7/4 holds ("An implementation is permitted.."). Maybe a chat (cannot figure out how to start it o.O)? – dyp Nov 08 '12 at 20:02
  • @DyP: 6.7/4 says "Constant initialization (3.6.2) of a block-scope entity with static storage duration, if applicable, is performed before its block is first entered." So if constant initialization is applicable, it's applied. The "An implementation is permitted... of *other* block-scope variables..." statement applies to initializations for which the conditions in 3.6.2 *do not* apply. At least, that's my interpretation, but like I said in the disclaimer, IANALL. – rici Nov 08 '12 at 23:05
1
  1. Yes, it is entirely possible to create compile-time constant strings, and manipulate them with constexpr functions and even operators. However,

  2. The compiler is not required to perform constant initialization of any object other than static- and thread-duration objects. In particular, temporary objects (which are not variables, and have something less than automatic storage duration) are not required to be constant-initialized, and as far as I know no compiler does that for arrays. See 3.6.2/2-3, which define constant initialization, and 6.7/4 for some more wording with respect to block-scope static-duration variables. Neither of these applies to temporaries, whose lifetime is defined in 12.2/3 and following.

So you could achieve the desired compile-time concatenation with:

static const auto conc = <some clever constexpr thingy>;
std::cout << conc;

but you can't make it work with:

std::cout <<  <some clever constexpr thingy>;

Update:

But you can make it work with:

std::cout << *[]()-> const /* type of s */ * {
             static constexpr auto s = /* constexpr call */;
             return &s;}()
          << " some more text";

But the boilerplate punctuation is way too ugly to make it any more than an interesting little hack.


(Disclaimer: IANALL, although sometimes I like to play one on the internet. So there might be some dusty corners of the standard which contradict the above.)

(Despite the disclaimer, and pushed by @DyP, I added some more language-lawyerly citations.)

rici
  • Could you point out for me where the Standard says temporaries have dynamic storage duration? Cannot find it.. – dyp Nov 08 '12 at 19:32
  • I'd add 6.7/4, as we deal with block-scope variables here. And this only _permits_ an implementation to do "early" init of local static-storage-duration variables (it's required to be initialized before first block entry). – dyp Nov 08 '12 at 19:36
  • @DyP: 12.2/3 "Temporary objects are destroyed as the last step in evaluating the full-expression (1.9) that (lexically) contains the point where they were created." There are some exceptions following that, but nothing which would allow the temporary to become permanent. – rici Nov 08 '12 at 19:40
  • For me, dynamic storage duration is (erroneously?) related to `new` and `delete` - where I would expect (effectively) all compilers to put a temporary on the stack. – dyp Nov 08 '12 at 19:43
  • @DyP: quite right, I actually meant "automatic", but 12.2/3 seems to say that temporaries have even shorter lives than that. I can't find a phrase to describe temporary object lifetimes other than that, so I edited the response accordingly. Regardless of the standard, which probably does allow a compiler to constant initialize a constexpr temporary -- otherwise user-defined string literals would be a lot less interesting -- I'm pretty sure that compilers don't actually do it, except maybe for user-defined string literals, which are not yet widely implemented. – rici Nov 08 '12 at 19:50
  • I've edited my answer as I recalled you can enforce evaluation of constexpr e.g. in template arguments or wherever a constant expression is required. Edit: it seems from the assembly it still doesn't work? What's going on? – dyp Nov 08 '12 at 19:52
  • String literals are always lvalues, so I don't see how they could ever be used as constant expressions. (_Character_ literals are a different story -- see e.g. `boost::mpl::string<>`.) – ildjarn Nov 09 '12 at 06:04
  • @ildjarn, what's wrong with lvalues? http://liveworkspace.org/code/b7a79af9fab4cb4b72deb5e93c36aba2 clearly shows a string literal used as a constant expression. (Actually, a character from the string literal, but it's definitely a constexpr function with a string literal as an argument.) – rici Nov 09 '12 at 06:22
  • The fact that a string literal's _characters_ (as previously mentioned) and _size_ are constant expressions has no reflection on the string literal itself. – ildjarn Nov 09 '12 at 06:30
  • @ildjarn, I don't use the string literal's size. Read that code more carefully. The 32 comes from the ascii value of `' '`. – rici Nov 09 '12 at 06:31
  • Read that code more carefully -- `S` has its value because of the string literal's size... – ildjarn Nov 09 '12 at 06:33
  • @ildjarn, but so what? The point is that the size of the std::array (which must be a constant expression) comes from a function whose argument is a string literal. – rici Nov 09 '12 at 06:35
  • @ildjarn, perhaps you're reacting to my statement that compile time strings are possible. I stand by it, but the strings' types will probably be more like boost:mpl::string's (I'm guessing). They'll still be character arrays, NUL-terminated if desired, and initializable from string-literals. Anyway, if I can get the length and every individual character out of a string literal, that's good enough for me. I don't know what aspect of a string literal non-const-expressiveness would be, then. – rici Nov 09 '12 at 06:39
  • Yes, I was referring to that statement. But, I'm glad we had this discussion, as your particular wording gave me a good idea for an approach to solving this. Thanks :-] – ildjarn Nov 09 '12 at 06:44
  • @ildjarn: I was planning to clean this up a bit, but what the heck. http://liveworkspace.org/code/c54e3506022c53bf537428c2c26c0502 – rici Nov 09 '12 at 06:51
0

Nope, for constexpr you need a legal function in the first place, and functions can't do pasting etc. of string literal arguments.

If you think about the equivalent expression in a regular function, it would be allocating memory and concatenating the strings - definitely not amenable to constexpr.

Useless