3

Consider the following code, which binds a temporary object to a const reference in "nested" fashion:

#include <string>

std::string foo()
{
    return "abc";
}

std::string goo()
{
    const std::string & a = foo();
    return a;
}

int main()
{
    // Is a temporary allocated on the heap to support this, even for a moment?
    const std::string & b = goo();
}

I have been trying to understand what the compiler must do in terms of memory storage in order to support this "nested" construct.

I suspect that for the call to foo(), memory allocation is straightforward: storage for a std::string will be allocated on the stack as the function foo() exits.

However, what must the compiler do to support storage for the object referenced by b? The stack for the function goo must unwind and "be replaced with" an object on the stack to which b refers, but in order to unwind the stack for goo, will the compiler be required to momentarily create a copy of the object on the heap (before copying it back to the stack in a different location)?

Or is it possible for the compiler to accomplish the requirements of this construct without any storage being allocated on the heap, even for a moment?

Or is it even possible for the compiler to use the same storage location for the object referred to by b as for the object referred to by a, without doing any additional allocation either on the stack or on the heap?

Dan Nissenbaum

5 Answers

7

I think that there's a middle step you've failed to consider, which is that you are not binding b to a, but instead to a copy of a. And this isn't due to any fancy memory shenanigans!

goo returns by value and, as such, that value is available within the scope of the full-expression inside main per all the usual mechanisms. It'll either be in main's stack frame, or somewhere else, or (in this contrived case) likely optimised out entirely.

The only magic here is that it is kept in main's scope until b goes out of scope, because b is a ref-to-const (instead of being near-immediately destroyed).

So, will the heap come into it in any way whatsoever? Well, if you have a heap, no. If you mean the free store then, still, no.
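A minimal sketch of the point above, assuming a hypothetical instrumented type `Traced` in place of `std::string` (the `live` counter and the `live_while_bound` helper are made up for illustration and are not part of the question). It counts how many objects are alive while `b` is in scope: the temporary bound to `a` is gone by then, and only the returned copy remains.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for std::string that counts live instances.
struct Traced {
    static int live;          // number of Traced objects currently alive
    std::string text;

    explicit Traced(std::string t) : text(std::move(t)) { ++live; }
    Traced(const Traced& other) : text(other.text) { ++live; }
    ~Traced() { --live; }
};
int Traced::live = 0;

Traced foo() { return Traced("abc"); }

Traced goo()
{
    const Traced& a = foo();  // temporary kept alive until goo's scope ends
    return a;                 // the return value is copy-constructed from a
}

// Returns how many Traced objects exist while b is bound.
int live_while_bound()
{
    const Traced& b = goo();  // binds to the returned copy, not to a's referent
    assert(b.text == "abc");
    return Traced::live;      // the object a referred to is already destroyed
}
```

Calling `live_while_bound()` should report a single live object, and `Traced::live` should be back to zero once the function returns, which is the "kept in scope until b goes out of scope" behaviour with no free-store object for the `Traced` itself.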

Lightness Races in Orbit
  • Where is the `copy of a` created - on the stack or on the heap? – Dan Nissenbaum Dec 05 '12 at 19:21
  • @DanNissenbaum: What you really mean to ask is "does the copy of `a` have automatic or dynamic storage duration", and the answer is this: why would it be dynamically allocated? It'll sit on the stack (maybe) just like every other copy of an object with automatic storage duration. – Lightness Races in Orbit Dec 05 '12 at 19:21
  • Because (and this could be a complete misunderstanding) the stack for the call to `goo` must be unwound *before* a copy of the object is placed on the stack for use in `main`. Therefore, a temporary copy must exist somewhere besides the stack while the stack for `goo` unwinds (or so it seems to me - but I'd like my misunderstanding to be cleared up if I'm wrong here). – Dan Nissenbaum Dec 05 '12 at 19:23
  • @DanNissenbaum: Obviously that is not the case, otherwise you'd never be able to return any value from any function, and it's really only that fundamental mechanism that you're asking us about. The "somewhere" you seek is the location in memory provided by the function call for its result to go, and what that is depends almost completely on _which_ implementation you're referring to, as it is far too low-level to be defined by the C++ language. – Lightness Races in Orbit Dec 05 '12 at 19:24
  • That's the answer - thanks for pointing it out. The storage is allocated on the stack for both `foo` and `goo` as space for the return value (as it is for any function call, temporaries and const references notwithstanding). When `goo` returns, the object is copied from the "return value location" of `foo` to the "return value location" of `goo`. – Dan Nissenbaum Dec 05 '12 at 19:30
  • @DanNissenbaum: If that's the stack, fine. I'm sticking with "return value location" though, since we're talking about C++ as an abstraction (I think). – Lightness Races in Orbit Dec 05 '12 at 19:34
  • I'm referring to any compilers that use a stack and a heap, and in those cases, the return value location is on the stack (at least that's my understanding for all such compilers currently in existence; if there are any that don't place the return value location on the stack, that would be useful to know). – Dan Nissenbaum Dec 05 '12 at 19:39
  • @DanNissenbaum: What would it be useful _for_? :) – Lightness Races in Orbit Dec 05 '12 at 19:40
  • I would be fascinated to know that a compiler exists that does not use a stack and/or heap. I think fascinating things tend to be useful in ways that are hard to predict ahead of time. But - I am right that there are no such compilers, right? – Dan Nissenbaum Dec 05 '12 at 19:46
  • @DanNissenbaum see http://stackoverflow.com/questions/10900885/are-there-stackless-or-heapless-implementation-of-c -- there are apparently mainframes on which real compilers create a linked list on the "heap" to handle automatic storage. – Yakk - Adam Nevraumont Dec 05 '12 at 19:59
  • @DanNissenbaum: I'm confident that they do exist. – Lightness Races in Orbit Dec 05 '12 at 20:06
4

Theoretically, since goo (and foo for that matter) returns by value, a copy of the variable referenced by a will be returned (and placed on the stack). Said copy will have its lifetime extended by b, until b's scope ends.

I think the main point you're missing is that you return by value. That means that after foo or goo returns, nothing inside the function matters any more: you're left with a temporary string, which you bind to a const reference.

In practice, everything will most likely be optimized out.

Luchian Grigore
  • However, it strikes me that it is not possible for the lifetime of the object referenced by `a` to be extended, because the stack must be unwound: The location of the stack pointer after `goo` completes is different from the location of the stack pointer after `foo` completes. – Dan Nissenbaum Dec 05 '12 at 19:19
  • @DanNissenbaum: The object referenced by `a` has nothing to do with it. Its lifetime is _not_ extended, particularly. – Lightness Races in Orbit Dec 05 '12 at 19:23
  • @DanNissenbaum `a` is not extended. It dies after the function returns. What goo() returned is a copy of `a`, and that copy lives in main()'s stack. If you were to return a reference to `a` instead, then you would see that what you actually have in mind is not possible (holding a valid reference to an object that got destroyed.) – Nikos C. Dec 05 '12 at 19:24
4

Here is an example of what the C++ standard allows the compiler to rebuild your code as. I'm using full NRVO. Note the use of placement new, which is a moderately obscure C++ feature. You pass new a pointer, and it constructs the result there instead of in the free store.

#include <new>     // placement new
#include <string>

void __foo(void* __construct_std_string_at)
{
  new(__construct_std_string_at)std::string("abc");
}

void __goo(void* __construct_std_string_at)
{
  __foo(__construct_std_string_at);
}

int main()
{
  // raw storage for the return value, aligned for a std::string
  alignas(std::string) unsigned char __buff[sizeof(std::string)];
  // Is a temporary allocated on the heap to support this, even for a moment?
  __goo(&__buff[0]);
  const std::string & b = *reinterpret_cast<std::string*>(&__buff[0]);
  // ... more code here using b I assume
  // end of scope destructor (std::string is std::basic_string<char>):
  reinterpret_cast<std::string*>(&__buff[0])->~basic_string();
}

If we blocked NRVO in goo, it would instead look like

#include <new>     // placement new
#include <string>

void __foo(void* __construct_std_string_at)
{
  new(__construct_std_string_at)std::string("abc");
}

void __goo(void* __construct_std_string_at)
{
  // local storage for the temporary that a refers to
  alignas(std::string) unsigned char __buff[sizeof(std::string)];
  __foo(&__buff[0]);
  const std::string & a = *reinterpret_cast<std::string*>(&__buff[0]);
  // copy a into the caller-provided return value location
  new(__construct_std_string_at)std::string(a);
  // end of scope destructor (std::string is std::basic_string<char>):
  reinterpret_cast<std::string*>(&__buff[0])->~basic_string();
}

int main()
{
  // raw storage for the return value, aligned for a std::string
  alignas(std::string) unsigned char __buff[sizeof(std::string)];
  // Is a temporary allocated on the heap to support this, even for a moment?
  __goo(&__buff[0]);
  const std::string & b = *reinterpret_cast<std::string*>(&__buff[0]);
  // ... more code here using b I assume
  // end of scope destructor:
  reinterpret_cast<std::string*>(&__buff[0])->~basic_string();
}

Basically, the compiler knows the lifetime of the references. So it can create "anonymous variables" that store the actual instance of the variable, then create references to it.

I also noted that when you call a function, you effectively (implicitly) pass in a pointer to a buffer to where the return value goes. So the called function constructs the object 'in place' in the caller's scope.

With NRVO, a named variable in the called function's scope is actually constructed in the calling function's "where the return value goes", which makes returning easy. Without it, you have to do everything locally, then at the return statement copy your return value to the implicit pointer to your return value via the equivalent of placement new.

Nothing needs be done on the heap (aka free store), because lifetimes are all easily provable and stack-ordered.

The original foo and goo with the expected signature would have to still exist, as they have external linkage, until possibly discarded when it is found that nobody uses them.

All variables and functions starting with __ exist for exposition only. The compiler/execution environment no more needs to have a named variable than you need to have a name for a red blood cell. (In theory, because __ is reserved, a compiler that did such a translation pass before compiling would probably be legal, and if you actually used those variable names and it failed to compile it would be your fault, not the compiler's, but ... that would be a pretty hacky compiler. ;) )

Yakk - Adam Nevraumont
3

No, there will not be any dynamic allocation for the lifetime extension. The common implementation is equivalent to the following code transformation:

std::string goo()
{
    std::string __compiler_generated_tmp = foo();
    const std::string & a = __compiler_generated_tmp;
    return a;
}

There is no need for dynamic allocation, because the temporary's lifetime is only extended for as long as the reference is alive, and by the C++ lifetime rules the reference dies at the end of the current scope. By placing an unnamed variable (shown as __compiler_generated_tmp in the code above) in that scope, the usual lifetime rules apply and do what you expect.
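To make the timing concrete, here is a small sketch under my own assumptions: `Noisy`, `make`, `extension_order` and `no_extension_order` are hypothetical names invented for this illustration. It records construction and destruction events in a log, once with the temporary bound to a const reference and once with the temporary discarded, so the two destruction points can be compared.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper type that logs its own construction and destruction.
struct Noisy {
    std::vector<std::string>* log;
    explicit Noisy(std::vector<std::string>* l) : log(l) { log->push_back("ctor"); }
    Noisy(const Noisy& o) : log(o.log) { log->push_back("copy"); }
    ~Noisy() { log->push_back("dtor"); }
};

Noisy make(std::vector<std::string>* log) { return Noisy(log); }

// Temporary bound to a const reference: destroyed when the reference's scope ends.
std::vector<std::string> extension_order()
{
    std::vector<std::string> log;
    {
        const Noisy& r = make(&log);   // lifetime extended to match r
        assert(r.log == &log);
        log.push_back("still in scope");
    }                                  // the temporary is destroyed here
    log.push_back("after scope");
    return log;
}

// Temporary not bound to anything: destroyed at the end of the full-expression.
std::vector<std::string> no_extension_order()
{
    std::vector<std::string> log;
    {
        make(&log);                    // destroyed at the semicolon
        log.push_back("still in scope");
    }
    log.push_back("after scope");
    return log;
}
```

In the first function the final "dtor" entry should come after "still in scope"; in the second it should come before it. Neither case needs the free store for the `Noisy` object itself, which matches the `__compiler_generated_tmp` picture above.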

David Rodríguez - dribeas
  • OP's next question to you is "but isn't `__compiler_generated_tmp` "unwound" before it can be passed to the calling scope?" – Lightness Races in Orbit Dec 05 '12 at 19:26
  • Hmmm... Is that comment useful? Do I detect a touch of snarkiness? If so, it's unnecessary. If not, my apologies for suggesting so. – Dan Nissenbaum Dec 05 '12 at 19:33
  • No, not at all snarkiness. I am pre-empting in order to point out that I do not think David completely captured what I have figured out you're after. Useful? I leave that up to you. – Lightness Races in Orbit Dec 05 '12 at 19:34
  • Makes sense. You are right: That is what I was after. All of my thinking about rvalue references has made me miss the "obvious" here, so to speak. – Dan Nissenbaum Dec 05 '12 at 19:37
  • @LightnessRacesinOrbit: I am assuming that the OP understands basic C++ and knows that it is ok to return a *copy* of a local variable, which is exactly what is being done here. – David Rodríguez - dribeas Dec 05 '12 at 20:35
  • @DavidRodríguez-dribeas: Yet that's the entire basis for the original question :P – Lightness Races in Orbit Dec 05 '12 at 20:44
  • @LightnessRacesinOrbit - the basis of my original question also involved binding temporary values to `const` ref, which adds some complexity. So, returning a copy of a local variable is not the basis of my question - rather, simplifying my question to reveal that it boils *down to* returning a copy of a local variable is the *answer* to my question. You clarified this to me in your answer (as did Yakk in even greater detail). I think you make too many assumptions, here and in other comments - it borders on snarky. – Dan Nissenbaum Dec 05 '12 at 21:12
  • @DanNissenbaum: I'm certainly glad you're again performing a character analysis on my comments after the time I've spent helping you here. Further, your inference does not match the actual content of my comments. David knows me well enough to know that I'm simply participating in debate over the direction of the question; if I've had to try to fill in some gaps, it's only because the intent of the question was never entirely clear to begin with! So, perhaps you could find it in yourself to stop telling me off for things? – Lightness Races in Orbit Dec 05 '12 at 21:35
  • @LightnessRacesinOrbit I appreciate your help. Some things don't relate to the content of the topic - they relate to communication style. You're not *being* snarky, you're just bordering on it. Others, equally as helpful, don't border on it. "I'm certainly glad you're again performing a character analysis..." is an example of something akin to ... snarky? I don't know how to define it, but it's not necessary, or useful. – Dan Nissenbaum Dec 05 '12 at 21:38
  • @DanNissenbaum: It's necessary to _me_ when you've begun personally attacking me and calling me names for no apparent reason. I don't know why you are doing this on a thread full of people giving their time for you for free. Please stop, since you are not my mother. This will be my last post on this question. – Lightness Races in Orbit Dec 05 '12 at 21:44
  • @LightnessRacesinOrbit I'm not attacking you. I'm grateful for your help. Just pointing out communication style concerns. – Dan Nissenbaum Dec 05 '12 at 21:45
1

In std::string goo(), a std::string is returned by value.

When the compiler sees you calling this function in main(), it notices that the return value is a std::string, and allocates space on the stack of main for a std::string.

When goo() returns, the reference a inside goo() is not valid anymore, but the std::string that a references is copied into the space reserved on the stack in main().

In situations such as this, several optimizations are possible; you can read about what one compiler can do here

nos
  • I'm curious if the `std::string` move constructor is guaranteed to be called as the object is copied from the storage referred to by `a` to the space reserved on the stack in `main()`. I suppose it will? – Dan Nissenbaum Dec 05 '12 at 19:44
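
On the move-constructor question in the last comment, a hedged sketch may help. `Counted` and `copies_made_by_goo` are hypothetical names made up for this illustration; the type counts copy and move constructions. Since `a` is a const lvalue reference, my understanding is that `return a;` selects the copy constructor rather than the move constructor, and that copy elision does not apply because `a` names a reference, not a local object.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for std::string that counts copy and move constructions.
struct Counted {
    static int copies;
    static int moves;
    std::string text;

    explicit Counted(std::string t) : text(std::move(t)) {}
    Counted(const Counted& o) : text(o.text) { ++copies; }
    Counted(Counted&& o) noexcept : text(std::move(o.text)) { ++moves; }
};
int Counted::copies = 0;
int Counted::moves = 0;

Counted foo() { return Counted("abc"); }

Counted goo()
{
    const Counted& a = foo();
    return a;   // a is a const lvalue: the copy constructor is selected
}

// Returns how many copy constructions one call to goo() performs.
int copies_made_by_goo()
{
    Counted::copies = 0;
    const Counted& b = goo();
    assert(b.text == "abc");
    return Counted::copies;
}
```

Under these assumptions one call to `goo()` performs a single copy construction. Returning a non-const local object by name instead (for example `Counted a = foo(); return a;`) would make the return eligible for elision or a move.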