0

When I frequently concatenate strings to build a final result, what is the best practice for doing so (in C++11 or later)?

What about string::append() or ostringstream?

And another question: if I chain many + operators to concatenate strings, will a modern compiler optimize that in the most efficient way? (Just as Java can now optimize string concatenation using StringBuilder.)

Ziqi Liu
  • Personally I would code it so it looks good. Then profile it and determine if the performance is okay for you. – NathanOliver Jul 10 '18 at 16:28
  • See this question about c++ equivalent to Java's String Builder: https://stackoverflow.com/questions/2462951/c-equivalent-of-stringbuffer-stringbuilder – Ben Jones Jul 10 '18 at 16:31
  • I use `s1 += s2`, and am sure that would be identical to `s1.append(s2)`. I avoid `std::stringstream` if speed is required because it's relatively slow. – Galik Jul 10 '18 at 16:31
  • Use `string::operator+=`. It's probably exactly identical to `string::append`. `stringstream` has a different purpose. – DeiDei Jul 10 '18 at 16:32
  • On the inside `std::string` probably works more like `StringBuilder` because `Java` `String` is read-only so they can't be appended to. `std::string`, on the other hand can. – Galik Jul 10 '18 at 16:33
  • @NathanOliver never said enough. Write clean code, then look for something else if you profiled your software and realized that code is really too slow for your program. – Gian Paolo May 14 '21 at 05:39

2 Answers

3

If you don't need any of the features of std::ostringstream, plainly appending to the same string will generally be the most efficient way to obtain a single std::string (and calling reserve with an adequate size beforehand can avoid some needless reallocations). Internally, std::ostringstream works mostly the same way, but its extra layers add some overhead.
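A minimal sketch of the append-with-reserve approach described above. The helper name `join_parts` and the choice of a `std::vector<std::string>` as input are assumptions made for illustration only; they do not come from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: concatenates all parts into a single std::string.
std::string join_parts(const std::vector<std::string>& parts)
{
    // First pass: compute the final length so storage can be reserved up front.
    std::size_t total = 0;
    for (const std::string& p : parts)
        total += p.size();

    std::string result;
    result.reserve(total);  // at most one allocation for the whole result

    // Second pass: append in place. No reallocation is needed here because
    // capacity() is already at least `total`.
    for (const std::string& p : parts)
        result += p;        // same effect as result.append(p)

    assert(result.size() == total);
    return result;
}
```

If the final size is not known in advance, the same loop without the `reserve` call still works; the string then grows geometrically, which is usually good enough.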

And another question: if I chain many + operators to concatenate strings, will a modern compiler optimize that in the most efficient way? (Just as Java can now optimize string concatenation using StringBuilder.)

std::string is already something like a StringBuilder, as it's a mutable type that grows efficiently, so you don't need to use a different type for this.

Now, in C++03 a chain of + operators creates a temporary string for every intermediate result, so to keep things efficient you have to use += repeatedly on the target string.

OTOH, C++11 added overloads of operator+ taking rvalue references, which allow the allocated storage for the temporary strings to be recycled/expanded for the next concatenation, so in most cases efficiency should be comparable - thanks @Daniel Schepler for pointing it out.

One case where I think this would fall short is something like:

big_string += a + b + c;

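In this case, a + b + c itself is computed efficiently, but into a separate temporary, without taking into account that it is going to be appended to another string; the temporary must then be copied into big_string, whereas appending each piece directly would probably need no extra allocation. You'd be better off with either the "traditional" method (repeated +=), or with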

big_string = std::move(big_string) + a + b + c;
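The three forms discussed above can be put side by side as follows. This is only a sketch: the function names are hypothetical, and it demonstrates that the forms produce the same string, not how many allocations a particular standard library performs.

```cpp
#include <string>
#include <utility>

// Form 1: builds the temporary (a + b + c) first, then copies it into big.
std::string append_via_temporary(std::string big, const std::string& a,
                                 const std::string& b, const std::string& c)
{
    big += a + b + c;
    return big;
}

// Form 2: the "traditional" method, appending each piece directly to big.
std::string append_piecewise(std::string big, const std::string& a,
                             const std::string& b, const std::string& c)
{
    big += a;
    big += b;
    big += c;
    return big;
}

// Form 3: move big into the expression so that the rvalue overloads of
// operator+ (C++11) can keep extending its storage, then move-assign back.
std::string append_via_move(std::string big, const std::string& a,
                            const std::string& b, const std::string& c)
{
    big = std::move(big) + a + b + c;
    return big;
}
```

All three return the same text; whether forms 2 and 3 actually save an allocation depends on the capacity `big` already has and on the implementation, so profiling remains the way to decide.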
Matteo Italia
  • If using C++11, `string::operator+` has several overloads involving rvalues which will be able to reuse the allocated storage from one temporary value to the next. – Daniel Schepler Jul 10 '18 at 16:54
  • @DanielSchepler woa that's something that I managed to miss in 8+ years of C++11. Fixing the answer... – Matteo Italia Jul 10 '18 at 17:04
  • @DanielSchepler @MatteoItalia Is this safe? I guess this rvalue-based `operator+` returns an rvalue reference to the string itself, but https://stackoverflow.com/a/66952566/6222803 says `std::string` is not _MoveAssignable_ – Tuff Contender Oct 13 '21 at 15:50
  • @TuffContender: `operator+` ([number 7](https://en.cppreference.com/w/cpp/string/basic_string/operator%2B)) returns an actual object that is distinct from `big_string`, so there should be no self-assignment problem; the return value is also an rvalue, so it should always trigger the same overload in the subsequent `+` calls, and the move assignment operator ([number 2](https://en.cppreference.com/w/cpp/string/basic_string/operator%3D)) at the end. Hence, there should be no problem nor any extra copy. – Matteo Italia Oct 13 '21 at 19:34
3

In case you are thinking of something like StringBuilder from the managed-language world:

You can use the Alphabet (Google) library, ABCLib, ABCL, or just Abseil.

Abseil's Strings library looks ahead and allocates everything it needs at once, then builds the string in that storage. For concatenation, you just need absl::StrCat() and absl::StrAppend().
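The "measure first, allocate once, then copy" idea can be sketched with the standard library alone. Note that `concat_once` below is a hypothetical helper written for illustration; it is not Abseil's implementation and, unlike absl::StrCat(), it only accepts std::string arguments.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, NOT Abseil's code: concatenates one or more
// std::string objects with a single up-front reservation.
// Every argument must be a std::string (string literals will not compile).
template <typename... Rest>
std::string concat_once(const std::string& first, const Rest&... rest)
{
    // Collect pointers to the arguments so they can be walked twice.
    const std::string* pieces[] = { &first, &rest... };

    // Pass 1: "look ahead" and add up the sizes of all pieces.
    std::size_t total = 0;
    for (const std::string* p : pieces)
        total += p->size();

    std::string out;
    out.reserve(total);  // single allocation for the final result

    // Pass 2: copy every piece into the reserved storage.
    for (const std::string* p : pieces)
        out += *p;

    assert(out.size() == total);
    return out;
}
```

With Abseil itself the corresponding calls would be `absl::StrCat(s1, s2, s3)` to build a new string and `absl::StrAppend(&dest, s1, s2)` to append to an existing one.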

I'm not any good at explaining things. Perhaps this godbolt link below may speak better than I do.

godbolt.org/g/V45pXJ

Learn more on YouTube : CppCon 2017: Titus Winters “Hands-On With Abseil” (ffw to 32min)

youtube.com/watch?v=xu7q8dGvuwk&t=32m

#include <string>
#include <iostream>
#include <absl/strings/str_cat.h>

int main()
{
    std::string s1,s2,s3,s4,
                s5,s6,s7,s8,
                s9,s10,s11,s12;
    std::getline(std::cin, s1);
    std::getline(std::cin, s2);
    std::getline(std::cin, s3);
    std::getline(std::cin, s4);
    std::getline(std::cin, s5);
    std::getline(std::cin, s6);
    std::getline(std::cin, s7);
    std::getline(std::cin, s8);
    std::getline(std::cin, s9);
    std::getline(std::cin, s10);
    std::getline(std::cin, s11);
    std::getline(std::cin, s12);
    std::string s = s1 + s2 + s3 + s4 +  // a call to operator+ for each +
                    s5 + s6 + s7 + s8 +
                    s9 + s10 + s11 + s12;

    // you should see that
    // a lot of destructors get called at this point
    // because operator+ creates temporaries

    std::string abseil_s = 
       absl::StrCat(s1,s2,s3,s4,  // load all handles into memory
                    s5,s6,s7,s8,  // then make only one call!
                    s9,s10,s11,s12);

    return s.size() + abseil_s.size();

    // you should see that
    // only the "real" s1 - s12 get destroyed
    // at this point
    // because there are no temporaries!

}


Update 2021

Today you can alternatively use fmt::format, or std::format once your standard library's C++20 implementation is complete. (Current fmtlib now bumps support into C++14.)

Like Abseil's StrCat, format receives all of its arguments in a single call, so no temporaries are wasted.

    std::string fmt_s = 
        fmt::format("{}{}{}{}{}{}{}{}{}{}{}{}",
                    s1,s2,s3,s4,  // load all handles into memory
                    s5,s6,s7,s8,  // then make only one call!
                    s9,s10,s11,s12);


sandthorn