According to "How to get around the warning "rvalue used as lvalue"?", Visual Studio will merely warn on code such as this:

int bar() {
   return 3;
}

void foo(int* ptr) {

}

int main() {
   foo(&bar());
}

In C++ it is not allowed to take the address of a temporary (or, at least, of an object referred to by an rvalue expression?), and I thought that this was because temporaries are not guaranteed to even have storage.

But then, although diagnostics may be presented in any form the compiler chooses, I'd still have expected MSVS to error rather than warn in such a case.

So, are temporaries guaranteed to have storage? And if so, why is the above code disallowed in the first place?

Lightness Races in Orbit
  • **Related:** http://stackoverflow.com/questions/4301179/why-is-taking-the-address-of-a-temporary-illegal (though I'm not quite convinced that it's the same) – Lightness Races in Orbit Jan 06 '12 at 19:33
  • one of the most epic answers at SO applies to your question very well: [Can a local variable's memory be accessed outside its scope?](http://stackoverflow.com/a/6445794/1025391) – moooeeeep Jan 06 '12 at 19:35
  • MSVS is allowed to make whatever language extensions it would like. I agree with you that it's weird, though. – Carl Norum Jan 06 '12 at 19:36
  • Recent versions of Visual C++ correctly reject this code (I don't know when this was fixed; I do know that the Visual C++ 11 Developer Preview rejects the code). – James McNellis Jan 06 '12 at 19:37
  • Also, as noted by litb in the comments to my answer to the "related" question, all temporaries have storage (because a temporary is a kind of object, and an object is a region of memory), but not all rvalues are temporaries, and some rvalues do not necessarily have storage. – James McNellis Jan 06 '12 at 19:43
  • Any compiler is allowed to accept ill-formed code as an extension (as VC++ does here) as long as it "diagnoses" the issue. Warnings qualify as a diagnostic here so VC++ is technically standard-compliant by allowing this, but _only because_ of the warning. – ildjarn Jan 06 '12 at 20:11
  • @ildjarn: I know, but I'd still expect a certain level of sensibility! – Lightness Races in Orbit Jan 06 '12 at 20:21
  • **Note:** I generally require standard citations, not just assertions, in answers to questions like this. Thanks for your contributions. – Lightness Races in Orbit Jan 06 '12 at 20:22
  • This is probably the best answer: http://stackoverflow.com/questions/2280688/taking-the-address-of-a-temporary-object – MSN Jan 06 '12 at 23:42
  • @MSN: That's a question. Which answer? – Lightness Races in Orbit Jan 07 '12 at 01:51
  • http://stackoverflow.com/a/2281928/6210 – MSN Jan 07 '12 at 05:06
  • @MSN: +1 Yep that's pertinent, thanks. – Lightness Races in Orbit Jan 07 '12 at 10:27
  • I concur. Also [this one](http://stackoverflow.com/a/4301256/1182907). The important point being the fact that the error stems from getting the address of an rvalue, not of a temporary. – Loomchild Mar 16 '12 at 17:02
  • @Loomchild: That's almost a dupe, actually. The only difference between that question and mine appears to be that my question has arisen for a different reason (that is, because VS is clearly able to not care about the rule) – Lightness Races in Orbit Mar 16 '12 at 17:30
  • BTW: `foo(&(int){bar()})` is legal in C (since 1999). – Kornel Mar 19 '12 at 20:14
  • @KingsIndian: Why did you add the `c` tag? This is a C++ question. And why the Visual Studio tags? Though I use VS as a rationalisation point, the question is about C++ itself. – Lightness Races in Orbit Dec 27 '12 at 18:45
  • @LightnessRacesinOrbit I rolled it back to the previous version before my edit. Thanks. – P.P Dec 27 '12 at 18:53
  • @KingsIndian: Okay I'd already done that, but thanks =) – Lightness Races in Orbit Dec 27 '12 at 18:54

6 Answers

Actually, in the original language design it was allowed to take the address of a temporary. As you have noticed correctly, there is no technical reason for not allowing this, and MSVC still allows it today through a non-standard language extension.

The reason why C++ made it illegal is that binding references to temporaries clashes with another C++ language feature that was inherited from C: Implicit type conversion. Consider:

void CalculateStuff(long& out_param) {
    long result;
    // [...] complicated calculations
    out_param = result;
}

int stuff;
CalculateStuff(stuff);  //< this won't compile in ISO C++

CalculateStuff() is supposed to return its result via the output parameter. But under the original rule, what really happened was this: the function accepts a long& but is given an argument of type int. Through C's implicit type conversion, that int is converted to a value of type long, creating an unnamed temporary in the process. So instead of the variable stuff, the function operates on an unnamed temporary, and all side effects applied by that function are lost once that temporary is destroyed. The value of the variable stuff never changes.

References were introduced to C++ to allow operator overloading, because from the caller's point of view, they are syntactically identical to by-value calls (as opposed to pointer calls, which require an explicit & on the caller's side). Unfortunately it is exactly that syntactical equivalence that leads to troubles when combined with C's implicit type conversion.

Since Stroustrup wanted to keep both features (references and C-compatibility), he introduced the rule we all know today: Unnamed temporaries only bind to const references. With that additional rule, the above sample no longer compiles. Since the problem only occurs when the function applies side-effects to a reference parameter, it is still safe to bind unnamed temporaries to const references, which is therefore still allowed.

This whole story is also described in Section 3.7 of The Design and Evolution of C++:

The reason to allow references to be initialized by non-lvalues was to allow the distinction between call-by-value and call-by-reference to be a detail specified by the called function and of no interest to the caller. For const references, this is possible; for non-const references it is not. For Release 2.0 the definition of C++ was changed to reflect this.

I also vaguely remember reading a paper that named who first discovered this behavior, but I can't recall it right now. Maybe someone can help me out?

ComicSansMS

You're right in saying that "temporaries are not guaranteed to even have storage", in the sense that the temporary may not be stored in addressable memory. In fact, functions compiled for RISC architectures (e.g. ARM) will very often return values in general-purpose registers and expect their inputs in those registers as well.

MSVS, producing code for x86 architectures, may always produce functions that return their values on the stack. Therefore they're stored in addressable memory and have a valid address.

George Skoptsov

Certainly temporaries have storage. You could do something like this:

#include <iostream>

template<typename T>
const T *get_temporary_address(const T &x) {
    return &x;
}

int bar() { return 42; }

int main() {
    std::cout << (const void *)get_temporary_address(bar()) << std::endl;
}

In C++11, you can do this with non-const rvalue references too:

#include <iostream>

template<typename T>
T *get_temporary_address(T &&x) {
    return &x;
}

int bar() { return 42; }

int main() {
    std::cout << (const void *)get_temporary_address(bar()) << std::endl;
}

Note, of course, that dereferencing the pointer in question (outside of get_temporary_address itself) is a very bad idea; the temporary only lives to the end of the full expression, and so having a pointer to it escape the expression is almost always a recipe for disaster.

Further, note that no compiler is ever required to reject an invalid program. The C and C++ standards merely call for diagnostics (i.e., an error or a warning), after which the compiler may either reject the program or compile it anyway, with undefined behavior at runtime. If you would like your compiler to strictly reject programs that produce diagnostics, configure it to convert warnings to errors.

bdonlan
  • Note that you are extremely close to undefined behaviour. The temporary introduced by `bar()` will only live until the end of the full expression, which in this case might not be entirely clear: `std::cout.operator<<((const void*)get_temporary_address(bar()))`. – Xeo Jan 06 '12 at 19:41
  • @Xeo, Indeed. However, converting the pointer to an integer without dereferencing it should be well-defined, should it not? The exact value would be unspecified, of course. – bdonlan Jan 06 '12 at 19:45
  • I know a compiler isn't required to, but I would expect a mainstream one to act sensibly. I guess this question is as much about guesstimating the extent of VS's sensibilities as it is about whether temporaries have storage. – Lightness Races in Orbit Jan 06 '12 at 19:46
  • Mainstream compilers are actually relatively lenient, so as to accept legacy programs from back when early compilers might not have checked for as much. – bdonlan Jan 06 '12 at 19:57

Temporary objects do have memory. Sometimes the compiler creates temporaries as well. In both cases these objects are about to go away, i.e. they shouldn't gather important changes by chance. Thus, you can get hold of a temporary only via an rvalue reference or a const reference, but not via a non-const lvalue reference. Taking the address of an object which is about to go away also feels like a dangerous thing and thus isn't supported.

If you are sure you really want a non-const reference or a pointer from a temporary object, you can return it from a corresponding member function: you can call non-const member functions on temporaries, and you can return this from such a member. However, note that the type system is trying to help you. When you trick it, you had better know that what you are doing is the Right Thing.

Dietmar Kühl
  • I don't want to _use_ this; I'm just trying to rationalise about this diagnostic behaviour. I did already state in the question that you cannot compliantly take the address of a temporary :) – Lightness Races in Orbit Jan 06 '12 at 19:57

As others have mentioned, we all agree that temporaries do have storage.

Why is it illegal to take the address of a temporary?

Because temporaries are allocated on the stack, the compiler is free to use that address for any other purpose it wants to.

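#include <stdio.h>

int *foo()
{
    int myvar = 5;
    return &myvar; // returns the address of a local that is about to die
}

int main()
{
    int *p = foo();
    printf("%d", *p); // undefined behaviour: myvar no longer exists
    return 0;
}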

Let's say the address of 'myvar' is 0x1000. This program will most likely print 5 even though it's illegal to access 0x1000 in main(), though not necessarily every time.

With a slight change to the above main():

#include <stdio.h>

void fun(int *q);

int *foo()
{
    int myvar = 5;
    return &myvar; // address of myvar is 0x1000
}

int main()
{
    int *p = foo(); // illegal to access 0x1000 here
    printf("%d", *p);
    fun(p); // passing *that address* to fun()
    return 0;
}

void fun(int *q)
{
    int a, b; // some variables
    printf("%d", *q);
}

The second printf is very unlikely to print '5', as the compiler might have allocated the same portion of the stack (which contains 0x1000) for fun() as well. Whether it prints '5' for both printfs or for either of them is purely an unintentional side effect of how stack memory is being used and allocated. That's why it's illegal to access an address whose object is no longer alive.

P.P

Temporaries do have storage. They are allocated on the stack of the caller (note: this may depend on the calling convention, but I think they all use the caller's stack):

struct Tmp {};
void callee1( Tmp );
void callee2( Tmp );

void caller()
{
 callee1( Tmp() );
 callee2( Tmp() );
}

The compiler will allocate space for the result of Tmp() on the stack of the caller. You can take the address of this memory location; it'll be some address on the caller's stack. What the compiler does not guarantee is that it will preserve the value at this stack address after the callee returns. For example, the compiler can place another temporary there.

EDIT: I believe it's disallowed in order to eliminate code like this:

T bar();
T * ptr = &bar();

because it will very likely lead to problems.

EDIT: here is a little test:

#include <iostream>

typedef long long int T64;

T64 ** foo( T64 * fA )
{

 std::cout << "Address of tmp inside callee : " << &fA << std::endl;

 return ( &fA );
}

int main( void )
{
 T64 lA = -1;
 T64 lB = -2;
 T64 lC = -3;
 T64 lD = -4;

T64 ** ptr_tmp = foo( &lA );
 std::cout << "**ptr_tmp = *(*ptr_tmp ) = lA\t\t\t\t**" << ptr_tmp << " = *(" << *ptr_tmp << ") = " << **ptr_tmp << " = " << lA << std::endl << std::endl;

 foo( &lB );
 std::cout << "**ptr_tmp = *(*ptr_tmp ) = lB (compiler override)\t**" << ptr_tmp << " = *(" << *ptr_tmp << ") = " << **ptr_tmp << " = " << lB << std::endl
   << std::endl;

 *ptr_tmp = &lC;
 std::cout << "Manual override" << std::endl << "**ptr_tmp = *(*ptr_tmp ) = lC (manual override)\t\t**" << ptr_tmp << " = *(" << *ptr_tmp << ") = " << **ptr_tmp
   << " = " << lC << std::endl << std::endl;

 *ptr_tmp = &lD;
 std::cout << "Another attempt to manually override" << std::endl;
 std::cout << "**ptr_tmp = *(*ptr_tmp ) = lD (manual override)\t\t**" << ptr_tmp << " = *(" << *ptr_tmp << ") = " << **ptr_tmp << " = " << lD << std::endl
   << std::endl;

 return ( 0 );
}

Program output GCC:

Address of tmp inside callee : 0xbfe172f0
**ptr_tmp = *(*ptr_tmp ) = lA               **0xbfe172f0 = *(0xbfe17328) = -1 = -1

Address of tmp inside callee : 0xbfe172f0
**ptr_tmp = *(*ptr_tmp ) = lB (compiler override)   **0xbfe172f0 = *(0xbfe17320) = -2 = -2

Manual override
**ptr_tmp = *(*ptr_tmp ) = lC (manual override)     **0xbfe172f0 = *(0xbfe17318) = -3 = -3

Another attempt to manually override
**ptr_tmp = *(*ptr_tmp ) = lD (manual override)     **0xbfe172f0 = *(0x804a3a0) = -5221865215862754004 = -4

Program output VC++:

Address of tmp inside callee :  00000000001EFC10
**ptr_tmp = *(*ptr_tmp ) = lA                           **00000000001EFC10 = *(000000013F42CB10) = -1 = -1

Address of tmp inside callee :  00000000001EFC10
**ptr_tmp = *(*ptr_tmp ) = lB (compiler override)       **00000000001EFC10 = *(000000013F42CB10) = -2 = -2

Manual override
**ptr_tmp = *(*ptr_tmp ) = lC (manual override)         **00000000001EFC10 = *(000000013F42CB10) = -3 = -3

Another attempt to manually override
**ptr_tmp = *(*ptr_tmp ) = lD (manual override)         **00000000001EFC10 = *(000000013F42CB10) = 5356268064 = -4

Notice that both GCC and VC++ reserve hidden local variable(s) for temporaries on the stack of main and MIGHT silently reuse them. Everything goes normally until the last manual override: after it, we have an additional, separate call to std::cout. That call uses the stack space we just wrote to, and as a result we get garbage.

Bottom line: both GCC and VC++ allocate space for temporaries on the stack of the caller. They might have different strategies for how much space to allocate and how to reuse it (this might depend on optimizations as well). Both might reuse this space at their discretion, and therefore it is not safe to take the address of a temporary: we might try to read through this address a value we assume it still holds (say, after writing something there directly), while the compiler might already have reused the space and overwritten our value.

lapk