2

My friends and I are having a heated discussion about this code:

#include <iostream>
#include <string>

using namespace std;

string getString() {
    return string("Hello, world!");
}

int main() {
    char const * str = getString().c_str();
    std::cout << str << "\n";
    return 0;
}

This code produces different outputs on g++, clang and vc++:

g++ and clang output is the same:

Hello, world!

However, vc++ outputs nothing (or just spaces).

Which behavior is correct? Could this be due to a change in the standard regarding the lifetime of temporaries?

As far as I can tell from reading the IR generated by clang++, it works as follows:

store `getString()`'s return value in %1
std::cout << %1.c_str() << "\n";
destruct %1

Personally, I think gcc works this way too (I've tested it with verbose RVO/move tracing: custom ctors and dtors which print to std::cout). Why does vc++ behave differently?

clang = Apple LLVM version 6.1.0 (clang-602.0.53) (based on LLVM 3.6.0svn)

g++ = gcc version 4.9.2 (Debian 4.9.2-10)

VP.

3 Answers

8

Your program has undefined behaviour! You are "printing" a dangling pointer.

The result of getString(), a temporary string, lives no longer than that const char* declaration; accordingly, neither does the buffer that c_str() returned a pointer to.

So all of the compilers are "correct"; it is you and your friends who are wrong.

This is why we should not store the result of std::string::c_str(), unless we really, really need to.

Lightness Races in Orbit
  • in clang++ and g++ it does not seem to be dangling - it exists and is destroyed after std::cout, according to IR, so it is something implementation-dependent? – VP. Jan 04 '16 at 14:09
  • 4
    @VictorPolevoy that's the "undefined" part of undefined behavior. :) – erip Jan 04 '16 at 14:10
  • @VictorPolevoy Yes, this is undefined behaviour, so each compiler is free to handle the case as it wants. – freakish Jan 04 '16 at 14:10
8

Both are right; undefined behaviour is undefined.

char const * str = getString().c_str();

getString() returns a temporary, which will be destroyed at the end of the full expression which contains it. So after that line is finished, str is an invalid pointer and trying to inspect it will plunge you into the land of undefined behaviour.


Some standards quotes, as requested (from N4140):

[class.temporary]/3: Temporary objects are destroyed as the last step in evaluating the full-expression that (lexically) contains the point where they were created.

basic_string::c_str is specified like so:

[string.accessors]/1: A pointer p such that p + i == &operator[](i) for each i in [0,size()].

Since strings have their contents stored contiguously ([string.require]/4) this essentially means "return a pointer to the start of the buffer".

Obviously when a std::string is destructed it will reclaim any memory which was allocated, making that pointer invalid (if your friends don't believe that, they have other problems).

TartanLlama
  • 1
    It doesn't even point to garbage memory! It doesn't point to anything. – Lightness Races in Orbit Jan 04 '16 at 14:09
  • 1
    @LightnessRacesinOrbit Well, it has to point to something, right? There is an address stored there and some memory value under the address. What do you mean by "it doesn't point to anything"? – freakish Jan 04 '16 at 14:24
  • 1
    @freakish Nope. I mean exactly what I said. It is a pointer with no valid dereferenceability. It doesn't point to anything at all because there is nothing for it to point to. Its "value" is "mathematically" undefined. Pointers are not integer memory addresses; they are pointers. It's important to grok the abstractions at play here. – Lightness Races in Orbit Jan 04 '16 at 15:17
  • My friends still aren't convinced that this is UB, and they are asking me to show them where in the C++14 standard it is explained that this is UB. Can you please help me find this quote in the [standard](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4296.pdf)? – VP. Jan 04 '16 at 17:45
5

That is undefined behavior so anything can happen (including printing the string "correctly").

Things "working" anyway happens quite often with UB, unless the program is actually running on a paying customer's computer or being shown on the big screen in front of a vast audience ;-)

The problem is that you're taking a const char * pointing inside a temporary object that is destroyed before your use of the pointer.

Note that this is not the same situation as with:

const std::string& str = getString(); // Returns a temporary
std::cout << str << "\n";

because in this case there is a very specific rule in the C++ standard about references bound to temporaries: the lifetime of the temporary is extended until the reference str is itself destroyed. The rule applies only to references, and only when they are bound directly to the temporary or to a sub-object of the temporary (like const std::string& s = getObj().s;), not to the result of calling a member function on a temporary object.

6502
  • Interesting point in that last paragraph, can you point me to a citation in the standard for it? – AndyG Jan 04 '16 at 15:03
  • @AndyG http://stackoverflow.com/questions/2784262/does-a-const-reference-prolong-the-life-of-a-temporary – VP. Jan 04 '16 at 15:05
  • @VictorPolevoy: Thanks. To save people the trouble of visiting the link, it's 12.2/4 and 12.2/5: `"There are two contexts in which temporaries are destroyed at a different point than the end of the full expression. ... The second context is when a reference is bound to a temporary. The temporary to which the reference is bound or the temporary that is the complete object of a subobject to which the reference is bound persists for the lifetime of the reference"` – AndyG Jan 04 '16 at 15:09