Given this code:

#include <iostream>

class Foo {
    public:
        Foo(const std::string& label) : label_(label) {}

        void print() {
            std::cout << label_;
        }

    private:
        const std::string& label_;
};

int main() {
    auto x = new Foo("Hello World");
    x->print();
}

I get

Hello World

when I run it. If I modify it like this:

// g++ -o test test.cpp -std=c++17
#include <iostream>

class Base {
    public:
        Base(const std::string& label) : label_(label) {}

        void print() {
            std::cout << label_;
        }

    private:
        const std::string& label_;
};

class Derived : public Base {
    public:
        Derived(const std::string& label) : Base(label) {}
};

int main() {
    auto x = new Derived("Hello World");
    x->print();
}

I still get:

Hello World

but if I modify it like this:

// g++ -o test test.cpp -std=c++17
#include <iostream>

class Base {
    public:
        Base(const std::string& label) : label_(label) {}

        void print() {
            std::cout << label_;
        }

    private:
        const std::string& label_;
};

class Derived : public Base {
    public:
        Derived() : Base("Hello World") {}
};

int main() {
    auto x = new Derived();
    x->print();
}

I do not get any output. Can anyone explain this to me? I am compiling the program like this:

g++ -o test test.cpp -std=c++17

This is on Mac if it makes a difference.

anastaciu
morpheus
  • You're storing a reference to a temporary object. – tkausl Jul 28 '23 at 16:50
  • Try storing a reference to a non-temporary object. Or better yet, just store a copy of the object. The observed behaviors are both due to **undefined behavior**: any observed behavior is allowed. – Eljay Jul 28 '23 at 16:51
  • There's a bit more to the situation than just storing a reference to a temporary object though. Binding a `const` reference to a temporary object will normally extend the lifetime of the temporary to the lifetime of the reference. Claiming this has UB requires an explanation of why that shouldn't happen in this case. In my opinion it's a perfectly reasonable question. – Jerry Coffin Jul 28 '23 at 17:11
  • You are also missing the `#include <string>`, thus the code is not guaranteed to compile in all environments. – PaulMcKenzie Jul 28 '23 at 17:18
  • All 3 cases are undefined behavior. The **temporarily constructed referenced object** `const std::string("Hello World")` no longer exists when the `print()` function is called. Curiously, the already destroyed object has left a usable trace on the stack in the first 2 cases, but not in the last one. – dalfaB Jul 28 '23 at 17:47
  • I hope you're not going to play fast and loose with references like this in production code. It's really not worth the pain it's going to cause you. – Paul Sanders Jul 28 '23 at 18:56
  • Unrelated to the question contents: you have no reason to perform dynamic allocation through `new` in any of the sample snippets you provide. Just allocate on the stack, like `Foo x("Hello World");`. See https://stackoverflow.com/q/6500313/11910702 – Erel Jul 29 '23 at 17:17

2 Answers

3

All three pieces of code are incorrect: label_ is merely a reference to a temporary std::string object created from "Hello World". Because that object is a temporary, it is destroyed at the end of the full-expression that created it, so label_ is already dangling by the time x->print() runs, and reading through it is undefined behavior.

The compiler issues dangling reference warnings when optimization is enabled; curiously, it only becomes aware of the problem then.

Using compiler flags -Wall -Wextra -O3 with gcc 13.2:

https://godbolt.org/z/9xjsxhrTT

Speculating: in the first two cases the temporary is created in main, the same scope in which the object is declared, and perhaps that lets it survive long enough even though it is only a constructor argument. In the third case the temporary is created inside Derived's constructor and passed directly to the Base constructor, so it may get discarded before x->print() runs; main, where the call takes place, has no knowledge of the temporary.

Coming from Java or C#, where everything except primitive types is passed by reference with no worries, this may cause some confusion. In C++ that is not the case: it is up to the programmer to choose. A reference member does not keep the referenced object alive; if that object is a temporary, it is destroyed at the end of the full-expression that created it, whether or not a member still refers to it. In this case, as stated in the comment section, you should store the data by value, not by reference: the owner of label_ is Foo, so Foo is where the string is supposed to be stored.
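
As a minimal sketch of the by-value approach described above, the third snippet from the question could be rewritten roughly like this. The print(std::ostream&) signature and the helper printed_label() are assumptions added for illustration, so that the output can be captured and checked; they are not part of the original code.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

class Base {
public:
    // Take the string by value and move it into the member,
    // so the object owns its own copy of the label.
    explicit Base(std::string label) : label_(std::move(label)) {}

    // Assumption: the stream is passed in instead of hard-coding std::cout.
    void print(std::ostream& os) const { os << label_; }

private:
    std::string label_;  // owned by the object, lives exactly as long as it does
};

class Derived : public Base {
public:
    // The temporary std::string built from the literal is moved into label_
    // before the temporary is destroyed, so nothing dangles.
    Derived() : Base("Hello World") {}
};

// Hypothetical helper: returns what print() writes for a Derived object.
std::string printed_label() {
    Derived x;  // automatic storage; no need for new
    std::ostringstream out;
    x.print(out);
    return out.str();
}
```

Calling x.print(std::cout) from main (with <iostream> included) behaves the same way for all three variants of the question, because no variant depends on a temporary outliving its full-expression any more.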

anastaciu
  • "perhaps the fact that the temporary is at main scope allows it to live long enough": I would bet that the old memory still contains the old values (not overwritten), but the object should be dead because the statement has ended. – Angelicos Phosphoros Jul 28 '23 at 22:53
  • if all 3 are incorrect, then what is the correct way to pass a string literal to a function in c++? – morpheus Jul 29 '23 at 02:24
  • @morpheus, it is not the way it's passed, it's the way it's stored. To whom does label_ belong? Is it not to the class? Then it shouldn't be a reference at all, it should simply be `const std::string label_`; change the constructor parameter type accordingly, and then the data is stored where it belongs. – anastaciu Jul 29 '23 at 05:06
  • @morpheus In general, references inside objects are a very bad idea unless you know very well what you're doing and why you need them. – HTNW Jul 31 '23 at 00:10
2

Under "normal" circumstances, binding a const reference to a temporary will extend the lifetime of the temporary to the lifetime of the reference. For example, consider code like this:

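#include <iostream>
#include <string>

// Returns a std::string by value, so the call expression yields a temporary.
std::string foo() { return "Hello World"; }

void bar() {
    // The temporary is bound directly to a local const reference.
    std::string const& extended_life = foo();
    std::cout << extended_life << "\n";
}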

The string returned by foo is a temporary object whose lifetime would normally end at the end of the full-expression in which it was created (here, the initialization of extended_life in bar).

But, because we bind it to a const reference, its lifetime is extended to the lifetime of the reference, so when bar prints it out, the behavior is completely defined.
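
The extension can be observed with a small tracing type. This is only an illustrative sketch: Tracer, make_tracer, the global events log and extended_lifetime_demo are hypothetical names introduced here, not part of the question's code.

```cpp
#include <string>
#include <vector>

// Hypothetical event log used only for this illustration.
std::vector<std::string> events;

// Records its own construction and destruction in the log.
struct Tracer {
    Tracer() { events.push_back("constructed"); }
    ~Tracer() { events.push_back("destroyed"); }
};

// Returns a Tracer by value, just as foo() returns a std::string by value.
Tracer make_tracer() { return Tracer(); }

std::vector<std::string> extended_lifetime_demo() {
    events.clear();
    {
        // The temporary is bound directly to a local const reference,
        // so its lifetime is extended to the lifetime of ref.
        const Tracer& ref = make_tracer();
        (void)ref;  // silence unused-variable warnings
        events.push_back("still in scope");
    }  // ref goes out of scope here, and only now is the temporary destroyed
    events.push_back("scope left");
    return events;
}
```

In the returned log, "destroyed" appears after "still in scope" and before "scope left", which is the extension at work: without the reference, the temporary would have been destroyed before the next statement ran.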

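That doesn't apply when the reference involved is a member of a class, though. In your code the temporary is bound to the constructor's reference parameter, and a temporary bound to a reference parameter persists only until the end of the full-expression containing the call; initializing label_ from that parameter does not extend its lifetime any further. The standard doesn't directly explain why that's the case, but I suspect it's mostly a matter of what's easy or difficult to implement.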

Where I have something like Foo const &foo = bar();, the compiler has to "know" the declaration of bar(), and from that its return type. It also directly "knows" that foo is a reference to const, so the connection between what was returned and the lifetime extension is fairly direct and straightforward.

When you're storing something internally in a class, however, the compiler (at least potentially) has no access to the internals of that class. For example, in your third case, the compiler could compile main having seen only this much about Base and Derived:

class Base {
    public:
        Base(const std::string& label);
        void print();
    private:
        const std::string& label_;
};

class Derived : public Base {
    public:
        Derived();
};

Based on this, the compiler has no way to know that the string passed to the ctor is related in any way to label_ or that print() uses label_.

It's only by analyzing the data flow through the contents of the classes (which may not be available when compiling the calling code) that it can figure out what label_ stores or how it's used. Requiring the compiler to analyze that code when it's potentially not available would lead to a language that couldn't be implemented. Even if all the source code were available, the relationship could be arbitrarily complex, and at some point the compiler would no longer be able to determine what's going on and figure out what it needs to do.
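
For contrast with the local-reference case, here is a hedged sketch of what happens when the reference is a class member initialized from a constructor parameter. Tracer, Holder, the global events log and member_reference_demo are hypothetical names used only for illustration. The sketch deliberately never reads through the dangling member, so it stays well defined.

```cpp
#include <string>
#include <vector>

// Hypothetical event log used only for this illustration.
std::vector<std::string> events;

// Records its own construction and destruction in the log.
struct Tracer {
    Tracer() { events.push_back("constructed"); }
    ~Tracer() { events.push_back("destroyed"); }
};

// Mirrors Base from the question: it stores a reference to its argument.
class Holder {
public:
    explicit Holder(const Tracer& t) : ref_(t) {}

    // Accessor shown for completeness; the demo below never calls it,
    // because ref_ is dangling once the constructing statement has ended.
    const Tracer& get() const { return ref_; }

private:
    const Tracer& ref_;  // bound to the parameter, not to the temporary itself
};

std::vector<std::string> member_reference_demo() {
    events.clear();
    Holder h{Tracer()};  // the temporary lives only until the end of this statement
    events.push_back("next statement");  // h's member reference already dangles here
    return events;
}
```

The returned log is "constructed", "destroyed", "next statement": the temporary is gone before the next statement runs, even though h still holds a reference to it. That is the situation in all three snippets from the question.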

Jerry Coffin