253

Is there a C++ Standard Template Library class that provides efficient string concatenation functionality, similar to C#'s StringBuilder or Java's StringBuffer?

Andrew

10 Answers

212

The C++ way would be to use std::stringstream or just plain string concatenations. C++ strings are mutable so the performance considerations of concatenation are less of a concern.

With regard to formatting, you can do all the same formatting on a stream, but in a different way, similar to cout. Alternatively, you can use a strongly typed functor that encapsulates this and provides a String.Format-like interface, e.g. boost::format.
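As a rough illustration of formatting on a stream, here is a minimal sketch. The helper name `format_price` and its parameters are made up for this example; only the standard `<iomanip>` manipulators are real.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper (not from the answer): formats a label and a price
// with two decimal places using stream manipulators.
std::string format_price(const std::string& label, double price) {
    std::ostringstream out;
    // std::fixed + std::setprecision(2) play the role of a "F2"-style format.
    out << label << ": " << std::fixed << std::setprecision(2) << price;
    return out.str();
}
```

Calling `format_price("total", 3.14159)` should give `total: 3.14`. The manipulators stay set on the stream, which is one reason to keep the stream local to a helper like this.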

jk.
  • 92
    __C++ strings are mutable__: exactly. The entire reason `StringBuilder` exists is to [cover the inefficiency of Java's immutable basic String type](http://stackoverflow.com/questions/5234147/why-stringbuilder-when-there-is-string). In other words `StringBuilder` is patchwork, so we should be glad we don't need such a class in C++. – bobobobo Apr 16 '13 at 19:27
  • 71
    @bobobobo immutable strings have other benefits though, it's horses for courses – jk. Apr 16 '13 at 19:29
  • 11
    Don't plain string concatenations create a new object, so the same problem as with immutability in Java? Consider all variables are strings in the following example: a = b + c + d + e + f; Isn't it going to call operator+ on b and c, then operator+ on the result and d, etc.? – Serge Rogatch Jun 24 '15 at 16:43
  • 2
    @Hoten, as I understand, move semantics will only save copying of the object when returning from `operator+`. However, new string object is still needed to be created inside `operator+` for storing the result of string concatenation. – Serge Rogatch Sep 30 '15 at 05:44
  • 3
    @jk I believe marking the std::string as const makes it immutable, so I'd call this a flexibility. – Jan Smrčina Feb 24 '16 at 18:39
  • @SergeRogatch I have seen the Java compiler avoid multiple object creation in a stack of concatenations like that by creating a StringBuilder, using append(), and then getting the result at the end with a toString(). – neuralmer Jun 20 '16 at 22:06
  • 17
    Hold on a minute people, the standard string class knows how to mutate itself but that does not mean the inefficiency is not there. As far as I know std::string cannot simply extend the size of its internal char*. That means mutating it in a way which requires more characters requires a reallocation and copying. It's no different than a vector of chars and it is certainly better to reserve the space you need in that case. – Trygve Skogsholm Jul 31 '16 at 00:36
  • 11
    @TrygveSkogsholm - it is no different than a vector of chars, but of course the "capacity" of the string can be larger than its size, so not all appends need a reallocation. In general strings will use an exponential growth strategy so appending still amortizes to a linear cost operation. That's different than Java's immutable Strings in which every append operation needs to copy all characters in both Strings to a new one, so a series of appends ends up as `O(n^2)` in general. – BeeOnRope Nov 18 '17 at 19:33
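The capacity argument in the comments above can be checked with a small experiment. This is only a sketch: the helper name `count_reallocations` is invented here, and the exact growth factor is implementation-defined, so the absolute numbers will vary between standard libraries.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: appends `count` characters one at a time and returns
// how many times the string's capacity changed along the way. Each capacity
// change corresponds to a reallocation of the underlying buffer.
std::size_t count_reallocations(std::size_t count) {
    std::string s;
    std::size_t reallocations = 0;
    std::string::size_type last_capacity = s.capacity();
    for (std::size_t i = 0; i < count; ++i) {
        s += 'x';
        if (s.capacity() != last_capacity) {
            ++reallocations;
            last_capacity = s.capacity();
        }
    }
    return reallocations;
}
```

With an exponential growth strategy, the result for something like 100000 appends should be a small number compared with the number of appends, which is what makes the total cost amortize.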
137

The std::string::append function isn't a good option because it doesn't accept many forms of data. A more useful alternative is std::stringstream, like so:

#include <sstream>
#include <string>
// ...

std::stringstream ss;

// put arbitrary formatted data into the stream
ss << 4.5 << ", " << 4 << " whatever";

// copy the stream buffer's contents into a string
std::string str = ss.str();
Cole Tobin
Stu
55

NOTE: this answer has received some attention recently. I am not advocating this as a solution (it is a solution I have seen in the past, before the STL). It is an interesting approach, but it should only be preferred over std::string or std::stringstream if profiling your code shows that it makes an improvement.

I normally use either std::string or std::stringstream. I have never had any problems with these. I would normally reserve some room first if I know the rough size of the string in advance.
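Reserving room first might look like the sketch below. The `join` helper and its parameters are assumptions made for this example; the point is only that the final size is computed up front and passed to `reserve`.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: joins the parts with a separator. The final size is
// computed first and reserved, so the appends that follow should not need
// to grow the buffer again.
std::string join(const std::vector<std::string>& parts, const std::string& sep) {
    std::string::size_type total = 0;
    for (const std::string& p : parts) total += p.size();
    if (!parts.empty()) total += sep.size() * (parts.size() - 1);

    std::string result;
    result.reserve(total);
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (i != 0) result += sep;
        result += parts[i];
    }
    return result;
}
```

For example, `join({"a", "b", "c"}, ", ")` should produce `a, b, c`.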

I have seen other people make their own optimized string builder in the distant past.

class StringBuilder {
private:
    std::string main;
    std::string scratch;

    static const std::string::size_type ScratchSize = 1024;  // or some other arbitrary number

public:
    StringBuilder & append(const std::string & str) {
        scratch.append(str);
        if (scratch.size() > ScratchSize) {
            main.append(scratch);
            scratch.resize(0);
        }
        return *this;
    }

    const std::string & str() {
        if (scratch.size() > 0) {
            main.append(scratch);
            scratch.resize(0);
        }
        return main;
    }
};

It uses two strings: one for the majority of the string and the other as a scratch area for concatenating short strings. It optimises appends by batching the short append operations in one small string and then appending this to the main string, thus reducing the number of reallocations required on the main string as it gets larger.

I have not needed this trick with std::string or std::stringstream. I think it was used with a third-party string library before std::string; it was that long ago. If you adopt a strategy like this, profile your application first.

iain
  • 14
    Reinventing the wheel. std::stringstream is the proper answer. See good answers below. – Kobor42 Apr 16 '13 at 07:31
  • 13
    @Kobor42 I agree with you as I point out on the first and last line of my answer. – iain Apr 16 '13 at 12:37
  • 1
    I don't think the `scratch` string really accomplishes anything here. The number of reallocations of the main string is largely going to be a function of its final size, not the number of append operations, unless the `string` implementation is really poor (i.e., doesn't use exponential growth). So "batching" up the `append` doesn't help because once the underlying `string` is large it will only grow occasionally either way. On top of that it adds a bunch of redundant copy operations, and may cause _more_ reallocations (hence calls to `new`/`delete`) since you are appending to a short string. – BeeOnRope Nov 18 '17 at 19:30
  • @BeeOnRope I agree with you. – iain Nov 20 '17 at 07:55
  • i'm pretty sure `str.reserve(1024);` would be faster than this thing – hanshenrik Apr 25 '19 at 18:14
52

std::string is the C++ equivalent: It's mutable.

dan04
15

You can use .append() for simply concatenating strings.

std::string s = "string1";
s.append("string2");

You can also use +=:

std::string s = "string1";
s += "string2";

As for the formatting operations of C#'s StringBuilder, I believe using snprintf (or sprintf if you want to risk writing buggy code ;-) ) to write into a character array and then converting back to a string is about the only option.
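A minimal sketch of the snprintf route, assuming C++11 or later. The function name `format_user` and the format string are made up for illustration; the two-pass pattern (measure first, then write) relies on the standard behaviour of `std::snprintf` when given a null buffer of size 0.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: formats an id and a name with snprintf and returns
// the result as a std::string.
std::string format_user(int id, const char* name) {
    // First pass: a null buffer of size 0 only measures the output length.
    int needed = std::snprintf(nullptr, 0, "user %d: %s", id, name);
    if (needed < 0) return std::string();  // formatting failed

    // Second pass: write into a buffer with room for the terminating '\0'.
    std::vector<char> buffer(static_cast<std::size_t>(needed) + 1);
    std::snprintf(buffer.data(), buffer.size(), "user %d: %s", id, name);
    return std::string(buffer.data(), static_cast<std::size_t>(needed));
}
```

So `format_user(7, "ann")` should yield `user 7: ann`. Unlike the stream approach, the compiler can only check the arguments against the format string through warnings, which is the usual trade-off with the printf family.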

Andy Shellam
  • Not in the same way as printf or .NET's String.Format though, are they? – Andy Shellam Mar 17 '10 at 15:25
  • 1
    it's a little disingenuous to say they are the only way though – jk. Mar 17 '10 at 16:07
  • 2
    @jk - they're the only way when comparing the formatting ability of .NET's StringBuilder, which is what the original question specifically asked. I did say "I believe" so I could be wrong, but can you show me a way to get StringBuilder's functionality in C++ without using printf? – Andy Shellam Mar 17 '10 at 16:41
  • updated my answer to include some alternative formatting options – jk. Mar 22 '10 at 10:28
9

Since std::string in C++ is mutable you can use that. It has a += operator and an append function.

If you need to append numerical data, use the std::to_string functions (available since C++11).

If you want even more flexibility in the form of being able to serialise any object to a string then use the std::stringstream class. But you'll need to implement your own streaming operator functions for it to work with your own custom classes.
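To make the last two points concrete, here is a small sketch. `Point` and `describe` are hypothetical names invented for this example; `std::to_string` and a user-defined `operator<<` are the only techniques being shown.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical type used only for illustration.
struct Point {
    int x;
    int y;
};

// Custom streaming operator so that Point works with any std::ostream.
std::ostream& operator<<(std::ostream& out, const Point& p) {
    return out << '(' << p.x << ", " << p.y << ')';
}

std::string describe(const Point& p, int id) {
    // std::to_string covers the plain numeric case...
    std::string text = "point #" + std::to_string(id) + " at ";
    // ...and a stream covers types that provide their own operator<<.
    std::ostringstream out;
    out << p;
    text += out.str();
    return text;
}
```

With this, `describe(Point{1, 2}, 3)` should give `point #3 at (1, 2)`.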

Dominik Grabiec
8

A convenient string builder for C++

As many people have answered before, std::stringstream is the method of choice. It works well and has a lot of conversion and formatting options. IMO it has one pretty inconvenient flaw though: you cannot use it as a one-liner or as an expression. You always have to write:

std::stringstream ss;
ss << "my data " << 42;
std::string myString( ss.str() );

which is pretty annoying, especially when you want to initialize strings in the constructor.

The reason is that a) std::stringstream has no conversion operator to std::string and b) the operator<<() overloads of the stringstream don't return a stringstream reference, but a std::ostream reference instead, which cannot be used further as a string stream.

The solution is to derive from std::stringstream and give it better matching operators:

namespace NsStringBuilder {
template<typename T> class basic_stringstream : public std::basic_stringstream<T>
{
public:
    basic_stringstream() {}

    operator const std::basic_string<T> () const                                { return std::basic_stringstream<T>::str();                     }
    basic_stringstream<T>& operator<<   (bool _val)                             { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (char _val)                             { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (signed char _val)                      { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (unsigned char _val)                    { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (short _val)                            { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (unsigned short _val)                   { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (int _val)                              { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (unsigned int _val)                     { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (long _val)                             { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (unsigned long _val)                    { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (long long _val)                        { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (unsigned long long _val)               { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (float _val)                            { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (double _val)                           { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (long double _val)                      { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (void* _val)                            { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (std::streambuf* _val)                  { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (std::ostream& (*_val)(std::ostream&))  { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (std::ios& (*_val)(std::ios&))          { std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (std::ios_base& (*_val)(std::ios_base&)){ std::basic_stringstream<T>::operator << (_val); return *this; }
    basic_stringstream<T>& operator<<   (const T* _val)                         { return static_cast<basic_stringstream<T>&>(std::operator << (*this,_val)); }
    basic_stringstream<T>& operator<<   (const std::basic_string<T>& _val)      { return static_cast<basic_stringstream<T>&>(std::operator << (*this,_val.c_str())); }
};

typedef basic_stringstream<char>        stringstream;
typedef basic_stringstream<wchar_t>     wstringstream;
}

With this, you can write things like

std::string myString( NsStringBuilder::stringstream() << "my data " << 42 );

even in the constructor.

I have to confess I didn't measure the performance, since I have not yet used it in an environment that makes heavy use of string building, but I assume it won't be much worse than std::stringstream, since everything is done via references (except the conversion to string, but that's a copy operation in std::stringstream as well).
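For comparison, a different and much smaller way to get a one-expression builder is a variadic helper around std::ostringstream. This is not the approach described above, just a sketch assuming C++11; the names `append_all` and `build_string` are invented for this example.

```cpp
#include <sstream>
#include <string>

// Base case: nothing left to append.
inline void append_all(std::ostringstream&) {}

// Streams the first argument, then recurses on the rest.
template <typename First, typename... Rest>
void append_all(std::ostringstream& out, const First& first, const Rest&... rest) {
    out << first;
    append_all(out, rest...);
}

// Hypothetical helper: streams every argument into one ostringstream and
// returns the resulting string, so it can be used as an expression.
template <typename... Args>
std::string build_string(const Args&... args) {
    std::ostringstream out;
    append_all(out, args...);
    return out.str();
}
```

This allows `std::string myString(build_string("my data ", 42));`, including in a constructor's initializer list. It does not support chaining with `<<`, so it trades the stream-like syntax for not having to subclass a standard stream.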

user2328447
3

std::string's += doesn't work with const char* (what stuff like "string to add" appear to be), so definitely using stringstream is the closest to what is required - you just use << instead of +

sergeys
  • "_std::string's += doesn't work with const char*_" Yes, it does, and it has since at least C++98... https://m.cplusplus.com/reference/string/string/operator+=/ – underscore_d May 26 '22 at 20:22
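The correction in the comment above is easy to check. A minimal sketch (the function name `greet` is just for this example):

```cpp
#include <string>

// Exercises the standard operator+= overloads of std::string.
std::string greet() {
    std::string s = "string1";
    const char* tail = "string2";
    s += tail;         // operator+=(const char*)
    s += " and more";  // a string literal converts to const char*
    s += '!';          // operator+=(char)
    return s;
}
```

The result should be `string1string2 and more!`. What does not work is adding two string literals to each other with `+`, since neither operand is a std::string in that case.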
1

The Rope container may be worthwhile if you have to insert or delete strings at random positions in the destination string, or if you work with long character sequences. Here is an example from SGI's implementation:

crope r(1000000, 'x');          // crope is rope<char>. wrope is rope<wchar_t>
                                // Builds a rope containing a million 'x's.
                                // Takes much less than a MB, since the
                                // different pieces are shared.
crope r2 = r + "abc" + r;       // concatenation; takes on the order of 100s
                                // of machine instructions; fast
crope r3 = r2.substr(1000000, 3);       // yields "abc"; fast.
crope r4 = r2.substr(1000000, 1000000); // also fast.
reverse(r2.mutable_begin(), r2.mutable_end());
                                // correct, but slow; may take a
                                // minute or more.
Igor
0

I wanted to add something new because of the following:

On my first attempt I failed to beat the efficiency of std::ostringstream's operator<<, but with more attempts I was able to make a StringBuilder that is faster in some cases.

Every time I append a string, I just store a reference to it somewhere and increase a counter of the total size.
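A much simplified sketch of that basic idea, without the opaque buffer or any of the other tricks described in this answer. The class name `DeferredBuilder` is hypothetical and this is not the implementation that was measured:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical class: remembers pointers to the appended strings and their
// total size, and only copies characters when str() is called.
class DeferredBuilder {
public:
    // Stores only a pointer; the caller must keep `s` alive until str().
    // Passing a temporary here would leave a dangling pointer.
    DeferredBuilder& append(const std::string& s) {
        parts_.push_back(&s);
        total_ += s.size();
        return *this;
    }

    // Allocates once, then copies every referenced string in order.
    std::string str() const {
        std::string result;
        result.reserve(total_);
        for (const std::string* part : parts_) result += *part;
        return result;
    }

private:
    std::vector<const std::string*> parts_;
    std::size_t total_ = 0;
};
```

Usage would be along the lines of `sb.append(a).append(b); std::string out = sb.str();` where `a` and `b` are named strings that outlive the builder.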

The way I finally implemented it (horror!) is to use an opaque buffer (std::vector<char>):

  • 1-byte header: 2 bits to tell whether the following data is a moved string, a string, or a byte[]
  • the remaining 6 bits give the length of the byte[]

For byte[]:

  • I store the bytes of short strings directly (for sequential memory access)

For moved strings (strings appended with std::move):

  • The pointer to a std::string object (we have ownership)
  • A flag is set in the class if there are unused reserved bytes there

For strings:

  • The pointer to a std::string object (no ownership)

There is also one small optimization: if the last inserted string was moved in, it checks for reserved but unused bytes and stores further bytes there instead of using the opaque buffer. This saves some memory but actually makes it slightly slower; it may also depend on the CPU, and it is rare to see strings with extra reserved space anyway.

This was finally slightly faster than std::ostringstream, but it has a few downsides:

  • I assumed fixed-length char types (so 1, 2 or 4 bytes, not good for UTF-8). I'm not saying it will not work for UTF-8, just that I didn't check it out of laziness.
  • I used bad coding practice (an opaque buffer, which is easy to make non-portable; I believe mine is portable, by the way)
  • It lacks all the features of ostringstream
  • If a referenced string is deleted before merging all the strings: undefined behaviour.

Conclusion? Use std::ostringstream.

It already fixes the biggest bottleneck, and gaining a few percentage points in speed with my implementation is not worth the downsides.

CoffeDeveloper