Difference between string += s1 and string = string + s1

Question

One of my programs is exceeding the time limit when I am using fans = fans + s[i], while when I am using fans += s[i] it is being accepted... Why does this happen? To Explain more , fans is a string and s is also a string so while iterating over string s i want only some characters of s so i am creating a new string fans.Now there are two ways in which i can add character to my new string fans. The Problem is mentioned below

fans = fans + s[i]; // gives Time limit exceeded 
fans += s[i];       // runs successfully

One makes a temporary, the other doesn't. Move-assignment can make the former more efficient, but it can't get any better than the later. — WhozCraig, Jul 21 '19 at 10:38
Please extract and provide a [mcve], your question is off-topic without it. As a new user, also take the [tour] and read [ask]. — Ulrich Eckhardt, Jul 21 '19 at 10:43
So `fans` is an std::string and `s[i]` is a char? Or is it another string containing a single character? — Fabio says Reinstate Monica, Jul 22 '19 at 01:01
@DanielKamilKozar What data structure would you suggest for rapidly updated textual data which gets larger in each iteration? — Aykhan Hagverdili, Jul 22 '19 at 09:02
My question is put on hold as it is off topic , What changes should i Make ?? I don't know why it is put oh hold even though it has recieved upvoyes as well as multiple answers — Naman, Jul 22 '19 at 19:54
@MatthieuM. My question is put On hold Please suggest me how to edit this as i dont want it to be closed as it has many upvotes and is never been asked before.It will be very kind if u help what specific change to made as many of them can understand my question well — Naman, Jul 22 '19 at 20:15
@Naman Have you read Ulrich Eckhardt's comment, and mine? At the very least you should clarify what type `fans` and `s[i]` are, but actually it would be better if you [edit]ed your question and added a full program ([mcve]) that we can copy and compile. Just check the links that you've been given. It won't take long, and it will improve your question significantly. Thank you! — Fabio says Reinstate Monica, Jul 23 '19 at 00:45
@FabioTurati I have made the necessary changes to my question . I hope it now explains the question well.Actually Posting the whole code is not relevant as the code is lengthy and the problem is with this part only which I have mentioned .Is there any more changes u suggest? — Naman, Jul 24 '19 at 04:24
@UlrichEckhardt Please Have a look i Have made the question more clear... Can it be removed from On hold status as i think Its a good question and many of them are answering it as well...U can also suggest what else to do? — Naman, Jul 24 '19 at 04:43
You're not supposed to paste the code of the program you're working on. You are expected to extract a [mcve] from that code and post that. It should be possible to take it and compile it without changes. It shouldn't contain anything that is not necessary to demonstrate the issue. Your question still lacks that, so it is still off-topic. — Ulrich Eckhardt, Jul 24 '19 at 16:31

Aykhan Hagverdili · Accepted Answer · 2020-05-10T22:33:04.300

29

For built-in types a += b is exactly the same as a = a + b (except that a is evaluated only once), but for classes, those operators are overloaded and call different functions.
In your example fans = fans + s[i] creates a temporary string, and assigns (moves) it to fans, but fans += s[i] does not create that temporary, hence it may be faster.

edited May 10 '20 at 22:33

answered Jul 21 '19 at 10:45

Aykhan Hagverdili

28,141
6
41
93

3

"+=" may also cause a relocation if the string size is already as big as the preallocated buffer. If later on this turns out to be a problem (of course this should be measured), and if the size may be estimated in advance, the buffer size may be increased explicitly using string::reserve(). – IMil Jul 22 '19 at 02:07
@IMil In fact, depending on how the implementation handles allocation, it's quite possible that there's no difference between the two methods, isn't it? With a naive implementation I'd expect that both methods take almost exactly the same time (the time of 1 allocation). – MC ΔT Jul 22 '19 at 04:21
1

@MCΔT that's right. So, it would make sense to suggest to the OP that if performance is an issue, one should understand how standard containers work and what they do and don't guarantee in terms of speed and memory. Modifying an std::string inside a tight loop is definitely wasteful, but without knowing context it's hard to suggest replacement. – IMil Jul 22 '19 at 04:38

score 12 · Answer 2 · answered Jul 21 '19 at 10:51

12

std::string has members operator + and operator +=. The former is usually implemented with the latter by way of an intermediate temporary. Effectively looking something like this (check your implementation source if you want to know exactly what yours does):

/// note reference return type
std::string& operator +=(char c) 
{
    this->append(c);
    return *this;
}

// note value return type
std::string operator +(char c) const
{
    std::string tmp = *this;
    tmp += c; // or just tmp.append(c) directly
    return tmp;
}

The setup of tmp is expensive. The overall function can (and usually is) made better with move-assignment semantics to the final destination on the caller-side, but the expense of the temporary is none-the-less still there. Do it a few times and you won't notice the difference. Do it thousands, or millions, etc. of times, and it can mean a world of difference.

answered Jul 21 '19 at 10:51

WhozCraig

65,258
11
75
141

Implementing `string+anything` or `string+=anything` in terms of the other looks to always be inefficient (one allocation and deallocation too much). Are you sure there are implementations doing that? – Deduplicator Jul 21 '19 at 20:24
@Deduplicator `+=` is usually the more efficient of the two (though there may be outliers). It is relatively common to implement `+` by calling `+=` as above, and this does not incur any significant overhead. Both operators typically accept their RHS argument by const reference (if it is something non-trivial), so there is no extra copying of that argument, merely plain function call overhead -- and even that will often be eliminated if the compiler inlines it. Similarly NRVO or one of its friends means that `operator+` only creates one temporary, not two. – Miral Jul 22 '19 at 06:49
1

@Miral Let's go through it, ok? += using +, the string will be newly allocated, using any extra capacity available is impossible. + using +=, the string will allocate for a copy of the lhs, and must then reallocate to accomodate the rhs too. With types where addition is not concatenation one generally doesn't have that problem, but this is about strings featuring a capacity, specifically `std::string`. – Deduplicator Jul 22 '19 at 08:51
As I said, normally you would not implement `+=` by calling `+`, because that is completely inefficient. For `+` calling `+=`, yes, you're correct that the exact implementation above is not as efficient as it could be, since it _might_ allocate twice (but if the appended rhs fits into the capacity of the lhs, then that won't happen). For efficiency a different implementation would commonly be used in practice which reserves the appended size before calling `+=` (or the moral equivalent). That was outside the scope of this answer, though. – Miral Jul 22 '19 at 23:47

score 11 · Answer 3 · edited Jul 22 '19 at 09:43

11

If you use fans=fans+s[i], the string will be copied in every loop pass. The new element will be added to the copy of the string and the result will be reassigned to the variable fans. After this the old string will have to be removed because it is not referenced anymore. This takes a whole lot of time.

If you use the augmented assignment fans+=s[i] the string will not be copied in every loop pass and there is no need of removing the reference variable as there is no reference variable here. This saves a lot of time.

I hope now you can understand!!

edited Jul 22 '19 at 09:43

yassin

642
2
7
18

answered Jul 21 '19 at 11:00

Sahil Srivastav

107
2

In modern C++ there is no need to "copy" the string - move semantics can apply. – Alnitak Jul 22 '19 at 12:32
@Alnitak, I believe move semantics can make `string = string + s1` better than *without* move semantics, but not as good as `string += s1`. I believe the former will always have at least one allocation and one full copy of all of the bytes of both `string` and `s1`, while the latter (`string += s1`) copies only the bytes of `s1` and does not make any extra allocation (as long as the result can fit in the already-allocated capacity of `string`). Please see my related post [https://stackoverflow.com/q/57151379/1245420] and if I'm wrong, please correct me! – phonetagger Jul 22 '19 at 18:21
@phonetagger I was alluding to the additional copy caused by the `=` operator without move semantics available. With move semantics, the result of string + s1 can be reassigned to string without copy overhead. But yes, there'd be a temporary created to make `string + s1` in the first place, which does also involve a copy operation. – Alnitak Jul 23 '19 at 10:08
In `fans = fans + s[i]`, the result of `operator+` must materialize a temporary object ; which can be moved out of into `fans`, but this still necessitates a complete copy of the original content of `fans` existing simultaneously with the original content, until this move happens and then the original content is discarded – M.M Jul 24 '19 at 04:33

score 3 · Answer 4 · answered Jul 22 '19 at 13:39

For fundamental types, a = a + b and a += b mean the same thing.

For arbitrary class types, a = a + b and a += b are unrelated; they look up different operators, and those operators can do arbitrary things. Them being actually unrelated is code smell, a sign of a design problem.

a = a + b becomes operator=( a, operator+( a, b ) ) roughly; the actual lookup rules are a bit more complex (involving member operators and non-member operators, and the fact that = doesn't have a non-member operator, etc), but that is the core of it.

a += b becomes operator+=( a, b ) in a similar sense.

Now, it is a common pattern to implement + in terms of +=; if you do this, you get:

a = a + b

becomes

a = ((auto)(a) += b);

where (auto) is the new c++20/c++23 "create a temporary copy of the argument" feature.

Fundamentally, a+=b can reuse the contents of a directly, while a = a + b cannot; at the moment a+b is evaluated, it doesn't know that a will be soon overwritten.

Some libraries deal with this using a technique known as "expression templates"; a+b isn't a value, but rather a compile-time description of the expression a+b, which when assigned to a is actually used to populate a with data. With expression templates, the fundamental issue of a+=b knowing more than a=a+b is eliminated.

Now, for std::string specifically, a+b creates a temporary string object, then a=(a+b) moves that into a (it can reuse the buffer of the temporary string object or the buffer of a, the standard is silent on this matter).

a+=b must reuse any excess capacity in the a buffer. So if you a.reserve(1<<30) (1 billion), a+=b cannot allocate more.

Could you link to further discussion of this new `(auto)` feature? — M.M, Jul 24 '19 at 04:28
@m.m not discussion, but a paper: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0849r1.html — Yakk - Adam Nevraumont, Jul 24 '19 at 10:35
Thanks. Probably should allow `static_cast(x)` for consistency , i.e. `T(x)` means `static_cast(x)` and vice versa for everything else — M.M, Jul 24 '19 at 12:05
@M.M `T(x)` does not mean `static_cast(x)`, nor vice versa, in general. It does sometimes. And I'd argue that verbosity here is a sin; C++ standard tends to be over verbose in initial versions of things, and we should fight against that tendency. — Yakk - Adam Nevraumont, Jul 24 '19 at 12:34

Difference between string += s1 and string = string + s1

4 Answers4

Linked