
I'm returning to C++ after about a decade of other programming languages so bear with me here. The following program compiles for me in a C++20 project in CLion:

#include <iostream>

using namespace std;

class MyClass {
private:
public:
    MyClass() {
        cout << "MyClass constructor" << endl;
    }
    MyClass(const MyClass& myClass) {
        cout << "MyClass copy constructor" << endl;
    }

    MyClass& operator=(const MyClass& myClass) {
        cout << "MyClass operator=" << endl;
        return *this;
    }
    friend std::ostream& operator<< (std::ostream& os, const MyClass &myClass){
        return os << "MyClass stringifier";
    }
    ~MyClass(){
        cout << "MyClass destructor" << endl;
    }
};

MyClass& f(){
    cout << "f() entered" << endl;
    MyClass x;
    return x;
}

MyClass* g(){
    cout << "g() entered" << endl;
    MyClass x;
    return &x;
}

int main() {
    cout << "main() entered" << endl;
    MyClass& a = f();
    cout << "f() exited" << endl;
    cout << a << endl;
    MyClass* b= g();
    cout << "g() exited" << endl;
    cout << *b << endl;
}

and the output is:

main() entered
f() entered
MyClass constructor
MyClass destructor
f() exited
MyClass stringifier
g() entered
MyClass constructor
MyClass destructor
g() exited
MyClass stringifier

Now, when I hover over the return statements, CLion does give me some warnings:

Address of stack memory warning

Reference to stack memory warning

It might be interesting to note that this now quite old article mentions that the scenario generated by function f() is "not valid" C++. Yet my compiler seems to disagree.

I am surprised that this program compiles and runs without issue. Should I not be getting a "dangling reference" error at runtime where I try to print a and *b? I don't think that this falls under Return Value Optimization / Copy Elision since I am not returning a new instance of MyClass anywhere: the copy constructor is clearly not called anywhere since we can't see its printing side effect.

// Edit: The accepted answer to this SO post, posted in 2011, also seems to suggest that what f does here should not work.

Jason
  • 5
    *"Should I not be getting a "dangling reference" error at runtime"* - you get *undefined behavior*, which may also include the program appearing to work as intended (until it suddenly doesn't) – UnholySheep Jul 24 '23 at 09:13
  • 3
    I think that you miss the point that a lack of compiler errors does not mean "valid C++". At least if "valid" in our common understanding means, among other things, "no undefined behaviour". There are plenty of ways to shoot your own foot with C++, which the compiler (and usual runtime behaviour) will silently let you do ( ... and maybe audibly snigger when you got caught in that trap - but that might just be my imagination....). – Yunnosch Jul 24 '23 at 09:13
  • 4
    "_I'm returning to C++ after about a decade_": There never existed any "dangling reference" error. Accessing a dangling reference/pointer causes _undefined behavior_ which means that you have absolutely no guarantee what will happen. That has always been this way in C++ (and for equivalent scenarios in C). – user17732522 Jul 24 '23 at 09:14
  • If you've returned to C++ after such a long time, first get back up to speed by reading the [C++ core guidelines](https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines). And just assume most material found online is shockingly out-of-date – Pepijn Kramer Jul 24 '23 at 09:15
  • 1
    By the way compilers CAN generate warnings for your issues see https://godbolt.org/z/o7WbnTd1v. Like `:30:12: warning: reference to local variable 'x' returned [-Wreturn-local-addr]`. So be sure to crank up the warning levels as far as you can. – Pepijn Kramer Jul 24 '23 at 09:18
  • 1
    @PepijnKramer Thanks for the suggestion. After reading [this SO post](https://stackoverflow.com/questions/31790467/how-to-enable-all-compiler-warnings-in-clion) I went ahead and added `set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wall -Wextra") ` to my `CMakeLists.txt` and now I'm getting clear compile-time warnings. Based on this discussion so far I have a clear understanding of "undefined behavior" vs "no compiler warnings / no runtime errors". – Jason Jul 24 '23 at 09:23
  • Does this answer your question? [What is a dangling reference?](https://stackoverflow.com/questions/46011510/what-is-a-dangling-reference) Or any other topic about "dangling reference", "returning reference to auto storage" - there is tons of topics about it. – pptaszni Jul 24 '23 at 09:24
  • 1
    Partly @pptaszni, much more helpful has been the discussion in this comment thread. Thanks for linking to that discussion though. – Jason Jul 24 '23 at 09:25
  • **It's all undefined behavior as the comments have cited.** The reason why your program **just happens to work** is most likely because `MyClass` doesn't have any member variables of its own. Try putting some complexity into MyClass, call some other functions in between `f()` and the stream operator... and then you'll see undefined behavior at its best. – selbie Jul 24 '23 at 09:26
  • @selbie why could the existence of member variables possibly change the behavior of the runtime? – Jason Jul 24 '23 at 09:29
  • I would also use `-Wpedantic`, this prevents you from accidentally using "compiler extensions" and will ensure your code is compliant with the actual standard. (e.g. this will not allow you to use variable length arrays) – Pepijn Kramer Jul 24 '23 at 09:30
  • 1
    The output operator gets a dangling reference, but it is not actually using it. Having a dangling reference or pointer is not an error - using it is. – BoP Jul 24 '23 at 09:30
  • There is by the way one use case for returning a reference, that is when you have a `static` variable in your function. (Comes with some threadsafety guarantees too, see Meyer's singleton). – Pepijn Kramer Jul 24 '23 at 09:32
  • @Jason - Having member variables might cause the code to try to access them, and trigger a real error. – BoP Jul 24 '23 at 09:33

1 Answer


Here's a simple example using the same MyClass as you've provided:

int main() {
    MyClass* ptr = g();
    std::cout << *ptr << std::endl;
}

Technically, it's undefined behavior to dereference a bad pointer like that. But as you've noted, it just happens to work. For that matter, g() could even return NULL and this program would likely still appear to work. Here's why:

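At the end of the day, the C++ functions you defined get logically compiled into functions just like C functions, except with the "this" pointer accounted for and name mangling for overloads. (Strictly speaking, your friend operator<< is a non-member function that takes a reference, but the reference is passed as a pointer all the same, so I'll call it "this" here.) Hence, the compiler (sans name mangling) generated a function roughly like this: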

// Pseudocode: "this" is a keyword in real C++, so this is illustrative only.
std::ostream* MyClass_operator_ostream(std::ostream* os, MyClass* this) {
    os->write("MyClass stringifier", 19);
    return os;
}

And so your main is basically doing this:

    MyClass* ptr = g();
    MyClass_operator_ostream(&cout, ptr);

However, your stream operator implementation doesn't actually use the "this" pointer. So no code gets generated that would touch that bad pointer, and the dangling pointer never actually gets dereferenced.

Now let's say you extended MyClass to include a member variable and then your stream operator overload references that variable.

class MyClass {

    int value;   // gets assigned by constructor

    ...

    friend std::ostream& operator<< (std::ostream& os, const MyClass &myClass){
        return os << "MyClass stringifier.  value = " << myClass.value;
    }

    ...

};

Hence, the compiler will generate code like this again:

std::ostream* MyClass_operator_ostream(std::ostream* os, MyClass* this) {
    os->write("MyClass stringifier.  value = ", 30);

    const char* tmp = some_internal_code_to_convert_int_to_string(this->value);

    os->write(tmp, strlen(tmp));

    return os;
}

Oops, now this->value is getting used. If this is null, it will almost certainly crash. If this points to stack memory from a function that has already returned, it might print the expected value, but more likely it will print whatever happens to be on the stack at that moment. That is especially true if another function was called between g() and the cout << *ptr << endl statement.

Everything I've said above is in reference to your g() function that returns a pointer. But the f() function is the same. The compiler is treating references as pointers under the hood.

But technically, it's undefined behavior and you can't rely on anything I've said in this entire answer to hold. But it's the most likely explanation.

selbie