29

I just discovered the most baffling error and I don't understand why the compiler did not flag it for me. If I write the following:

string s = "abcdefghijkl";
cout << s << endl;

s.substr(2,3) = "foo";
s.substr(8,1) = '.';
s.substr(9,1) = 4;
cout << s << endl;

The compiler has no problem whatsoever with this, and the assignment statements appear to have no effect, based on what's printed out. In contrast,

s.front() = 'x';

has the effect I'd expect (since front returns a reference to a character) of changing the underlying string, and

s.length() = 4;

also has the expected effect of generating a compiler error complaining that you can't assign to something that isn't an lvalue, because length returns an integer. (Well, a size_t anyway.)

So... why on earth does the compiler not complain about assigning to the result of a substr call? It returns a string value, not a reference, so it shouldn't be assignable, right? But I've tried this in g++ (6.2.1) and clang++ (3.9.0), so it doesn't seem to be a bug, and it also doesn't seem to be sensitive to C++ version (tried 03, 11, 14).

Ken White
  • 123,280
  • 14
  • 225
  • 444
blahedo
  • 834
  • 8
  • 19
  • 8
    `It returns a string value, not a reference, so it shouldn't be assignable` Why do you think that? Doesn't `string str; str="foo";` do the same? – DeiDei Dec 12 '16 at 03:37
  • 2
    Sure, but in `string str; str="foo";`, the variable `str` is clearly an lvalue. Temporary values... aren't? I mean, `int n; n=4;` also works just fine, but my third example above errors (as I expected it to). – blahedo Dec 12 '16 at 03:57
  • 2
    Yes, it's an `xvalue` with a lifetime which ends on the following line, unless given a name to become an lvalue. But xvalues must be modifiable as well. Consider this: `cout << s.substr(2,8).push_back('x');` You call `push_back` on the temporary which allows it to be printed. It's destroyed on the next line. – DeiDei Dec 12 '16 at 04:01
  • 3
    @DeiDei `std::string::push_back` returns `void`, your snippet is ill-formed. – Ruslan Dec 12 '16 at 05:50
  • @Ruslan Let's say it is `std::string::append` to support the intent of the comment. – Etherealone Dec 13 '16 at 19:49

3 Answers3

45

The result of substr() is a std::string temporary object -- it's a self-contained copy of the substring, not a view on the original string.

Being a std::string object, it has an assignment operator function, and your code invokes that function to modify the temporary object.

This is a bit surprising -- modifying a temporary object and discarding the result usually indicates a logic error, so in general there are two ways that people try to improve the situation:

  1. Return a const object.
  2. Use lvalue ref-qualifier on assignment operator.

Option 1 would cause a compilation error for your code, but it also restricts some valid use-cases (e.g. move-ing out of the return value -- you can't move out of a const string).

Option 2 prevents the assignment operator being used unless the left-hand side is an lvalue. This is a good idea IMHO although not all agree; see this thread for discussion.

In any case; when ref-qualifiers were added in C++11 it was proposed to go back and change the specification of all the containers from C++03, but this proposal was not accepted (presumably, in case it broke existing code).

std::string was designed in the 1990s and made some design choices that seem poor today in hindsight, but we're stuck with it. You'll have to just understand the problem for std::string itself, and perhaps avoid it in your own classes by using ref-qualifiers, or views or whatever.

Community
  • 1
  • 1
M.M
  • 138,810
  • 21
  • 208
  • 365
12

The reason your code compiles is because it is legal C++. Here's a link that explains what's going on.

https://accu.org/index.php/journals/227

It's pretty long, so I'll quote the most relevant part:

Non-class rvalues are not modifiable, nor can have cv-qualified types (the cv-qualifications are ignored). On the contrary, the class rvalues are modifiable and can be used to modify an object via its member functions. They can also have cv-qualified types.

So the reason you can't assign to the rvalue returned by std::string::length is because it isn't an instance of a class, and the reason you can assign to the rvalue returned from std::string::substr is because it is an instance of a class.

I don't quite get why the language is defined this way, but that's how it is.

danieltm64
  • 471
  • 2
  • 4
7

Look at the code:

s.substr(2,3) = "foo";

The function call to substr returns a string, that is an object and a temporary value. After that you modify this object (actually by calling the overloaded assignment operator from the std::string class). This temporary object is not saved in any way. Compiler simply destroys this modified temporary.

This code does not make sense. You may ask, why compiler is not giving a warning? The answer is that compiler might be better. Compilers are written by people, not gods. Unfortunately C++ allows tons of various ways of writing senseless code or code that triggers undefined behavior. This is one of the aspects of this language. It requires better knowledge and higher attention from the programmer compared to many other languages.

I just checked with MSVC 2015, the code:

std::string s1 = "abcdef";
s1.substr(1, 2) = "5678";

compiles fine.

Kirill Kobelev
  • 10,252
  • 6
  • 30
  • 51
  • In any case, it's safe to add that the compiler will not even think about generating assembly for something like this, simply because it doesn't do anything. – DeiDei Dec 12 '16 at 03:47
  • Agree. Although this is a question of optimization. Optimization might be switched off plus it is not possible to guarantee that certain optimization will take place. Most likely it **will** take place but, well if not, compiler would be still fine. – Kirill Kobelev Dec 12 '16 at 03:49
  • This doesn't answer my question at all. I understand why the code doesn't *do* anything; I don't understand why it even *compiles*. – blahedo Dec 12 '16 at 03:59
  • 1
    It compiles because it is syntactically correct. Like it or not, C++ is defined in this way. – Kirill Kobelev Dec 12 '16 at 04:02
  • Temporary values shouldn't be assignable. See for example http://ideone.com/4bjRgN where I've created a function that returns `int` instead of `string`. – Mark Ransom Dec 12 '16 at 04:02
  • But you can call methods on temporary objects (this is what is taking place here). Methods may have side effects. – Kirill Kobelev Dec 12 '16 at 04:04
  • @MarkRansom it is calling `operator=` on `std::string`. In this case, `operator=` doesn't have other side effects. However other class may choose to do anything in `operator=`. So this code should definitely compile. – Bryan Chen Dec 12 '16 at 04:07
  • 2
    @MarkRansom -- that's a difference between builtin types and user-defined types. Operations on user-defined types can have visible side effects. Member function calls on rvalues have always been legal. – Pete Becker Dec 12 '16 at 04:08
  • @BryanChen so you're saying that built-in types are fundamentally different from class types in this regard? – Mark Ransom Dec 12 '16 at 04:09
  • @PeteBecker if that's so, then it needs to be part of the answer for it to make any sense. – Mark Ransom Dec 12 '16 at 04:10
  • It seems like this could have been prevented by making `substr()` return `const string`: http://stackoverflow.com/questions/8716330/purpose-of-returning-by-const-value – jtbandes Dec 12 '16 at 04:28
  • You are right, but in this case this would result in deep copy (instead of potentailly move copy) here^ `auto s3 = s1.substr(1,2);`. – Kirill Kobelev Dec 12 '16 at 04:38