3

I'm new to references in C++, and I have a question about how memory is allocated for a reference bound to a plain numeric literal. (One thing I'd like to check first: I suspect that "const reference", a term I come across frequently, means "reference to const", but I'm not sure.)

Here is my testing on ideone.com:

#include <stdio.h>

int main() {

    const int r0 = 123;
    const int &r1 = 123;
    const int &r2 = 123;
    const int &r3 = r2;

    printf("%p\n", (void *)&r0);
    printf("%p\n", (void *)&r1);
    printf("%p\n", (void *)&r2);
    printf("%p\n", (void *)&r3);

    return 0;
}

and the result:

0x7ffee3bd74c4
0x7ffee3bd74c8
0x7ffee3bd74cc
0x7ffee3bd74cc

The reason r2 has the same address as r3 is clear from this answer - How does a C++ reference look, memory-wise?, which says it depends on the compiler. But I'm wondering why the compiler doesn't also give r0, r1, and r2 the same address, since they all have the same constant value 123 (called a prvalue, if my search was right).

As a note: after some searching on this site, I found a closely related question, but about Python. Although it's a different language, I thought the idea should be the same or similar: according to that link, if my program were written in Python, there would be only one 123 in memory, to save space.

Some other answers I've read:

  1. C++ do references occupy memory: This answer suggests that, if necessary, int &x is implemented as *(pointer_to_x).
  2. How does a C++ reference look, memory-wise?: This answer suggests that the compiler will try its best to save space.
Kindred
  • I suspect the answer is because you ran the code without turning on any optimizations. – Mooing Duck Nov 22 '18 at 00:05
  • @MooingDuck: I was (re-)thinking about whether C++ would do the same optimization as Python before I fell asleep; now I've just woken up and realized that it should be either the converse/reverse question, or C++ should handle it better. What's the canonical mechanism by which C++ does this? – Kindred Nov 22 '18 at 03:57
  • By default, compilers generate unoptimized builds, which make it _far_ easier to step through and debug. However, every compiler also has a flag you can pass telling it to produce an optimized build, which makes the program faster and smaller. How are you compiling your code? – Mooing Duck Nov 22 '18 at 06:05
  • @MooingDuck: I just clicked run on [ideone.com](https://ideone.com). I did try to find optimization options there, but it seems they don't have any. I haven't learned much about optimization since I rarely think about it. – Kindred Nov 22 '18 at 06:11
  • @ptr_user7813604: http://coliru.stacked-crooked.com/ lets you pass the -O0 through -O3 flags to the compiler, so you can see the various results. (-O0, the default, means no optimization; -O3 is the maximum optimization level.) – Mooing Duck Nov 22 '18 at 07:14

3 Answers

6

Your 123 isn't a "constant". Rather, it is a literal. A literal forms an expression that is a prvalue (i.e. a temporary object initialized with the value given by the literal). When you bind that expression to a reference, the lifetime of that object is extended to that of the reference, but the important point here is that each such object is a distinct object, and thus has a distinct address.

If you will, the text string "123" provides a rule for how to create objects, but it is not by itself an object. You can rewrite your code to make this more explicit:

const int & r = int(123);   // temporary of type "int" and value "123"

(There's no single such thing as "a constant" in C++. There are lots of things that are constant in one way or another, but they all need more detailed consideration.)
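To make the "each such object is a distinct object" point concrete, here is a minimal sketch. The function name `literals_make_distinct_objects` is made up for this illustration; it binds references the same way the question does and compares the resulting addresses.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper for illustration: reports whether two references
// bound to the same literal refer to two different objects.
bool literals_make_distinct_objects() {
    const int &a = 123;  // a temporary int is created; its lifetime is extended to that of a
    const int &b = 123;  // a second, separate temporary
    const int &c = b;    // binds to the object b already refers to; nothing new is created

    assert(&b == &c);    // b and c name the same object

    std::printf("a: %p\nb: %p\nc: %p\n",
                static_cast<const void *>(&a),
                static_cast<const void *>(&b),
                static_cast<const void *>(&c));

    return &a != &b;     // the two temporaries are different objects
}
```

Calling this from `main` should print one address for `a` and a different one, shown twice, for `b` and `c`; the exact values depend on the compiler and platform.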

Kerrek SB
  • So since `const` is just a compile-time check, it has nothing to do with being a constant? – Kindred Nov 21 '18 at 23:05
  • A prvalue is an expression, not a temporary object. Also there may or may not be a temporary object materialized at some stage for a prvalue. Reference binding causes temporary materialization. – M.M Nov 21 '18 at 23:18
  • @ptr_user7813604: I'd rather say that "constant" is too vague a term for that question to be meaningful. Lots of things can be constant in different ways in C++ (e.g. types, expressions, initialization). – Kerrek SB Nov 22 '18 at 00:56
  • So is the literal just the parameter needed to create the so-called temporary object? Also, *the important point here is that each such object is a distinct object* doesn't seem obvious to me: why is it distinct? – Kindred Nov 22 '18 at 06:03
  • @ptr_user7813604: Maybe it helps if you think of literals as being part of the source code, but object creation as part of the runtime behaviour? Source code doesn't create objects, only *execution* does. The source code (in this case, the literal) determines *how* objects are created at runtime. Temporary objects are distinct by their very nature; a temporary object has no aliases. No two unrelated expressions denote the same temporary object (until you start binding the temporary to some reference). That's how the language has been designed. – Kerrek SB Nov 22 '18 at 11:33
  • @M.M: That's true, but I didn't want to go too deep distinguishing (evaluated!) expressions from the objects they designate. (And the materialization thing is new in C++17, the notion of temporaries used to be somewhat less clearly defined previously.) Do feel free to post an answer that explains it (or propose an edit). – Kerrek SB Nov 22 '18 at 11:34
  • I'm reading your comment and will give you my feedback. By the way, while waiting for your reply I asked a question about exactly *why it's distinct*; I'd appreciate it if [you could take a look](https://stackoverflow.com/q/53426022/7813604), since I think you have a clearer idea of what I'm asking about. – Kindred Nov 22 '18 at 11:38
  • OK, I think my problem is that I haven't yet accepted that it's *by their very nature*. If I said it's because optimization can always be done later, would that agree with your view? – Kindred Nov 22 '18 at 11:51
  • @ptr_user7813604: I'd say optimization is an unrelated topic that doesn't affect the present question, which is all about the object model and the nature of expressions in the language. All those are core parts of the C++ design. Ultimately the immediate answer is always "because that's how the language works", so perhaps a more satisfying question would be why the language has been designed this way. Temporary objects are *not* non-temporary objects in some global cache, they are truly temporary in the sense of being (newly, uniquely) created just at the point where they're needed. – Kerrek SB Nov 22 '18 at 11:55
4

The literal is not an object. The references do not refer to the literal. When you initialise a reference using a literal, a temporary object will be created, and the lifetime of the temporary object is bound to the lifetime of the reference.

The objects (one local variable, two temporaries) are separate and distinct objects despite having the same value. Since they're separate, they occupy separate memory locations. The standard mandates this, and that makes it possible to identify and distinguish objects based on their memory address.
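The lifetime part of this can be observed directly. The following is only a sketch: `Tracer` and `lifetime_extension_log` are hypothetical names introduced here, and a class type is used instead of `int` so that construction and destruction leave a visible trace.

```cpp
#include <cassert>
#include <string>

// Hypothetical tracer type: appends to a log when it is constructed
// and again when it is destroyed.
struct Tracer {
    std::string &log;
    explicit Tracer(std::string &l) : log(l) { log += "ctor;"; }
    ~Tracer() { log += "dtor;"; }
};

// Records the order of events when a temporary is bound to a reference.
std::string lifetime_extension_log() {
    std::string log;
    {
        const Tracer &ref = Tracer(log);  // temporary bound to a reference
        (void)ref;                        // silence unused-variable warnings
        assert(log == "ctor;");           // the temporary has been constructed
        log += "in-scope;";               // ...and is still alive at this point
    }                                     // ref goes out of scope; the temporary is destroyed here
    log += "after-scope;";
    return log;
}
```

The returned string is `ctor;in-scope;dtor;after-scope;`: the temporary is destroyed when the reference's scope ends, not at the end of the statement that created it.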

eerorika
4

The three declaration statements:

const int &r1 = 123;
const int &r2 = 123;
const int &r3 = r2;

create two temporary objects, one for r1 and one for r2, each with its lifetime extended to match that of the reference bound to it; r3 binds to the object r2 already refers to, so no third temporary is created. Now, there is a language rule that says:

Any two objects with overlapping lifetimes (that are not bit fields) are guaranteed to have different addresses unless one of them is a subobject of another or provides storage for another, or if they are subobjects of different type within the same complete object, and one of them is a zero-size base.

Since r1 and r2 are bound to two distinct temporaries, you cannot observe these objects at overlapping addresses.
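As a rough illustration of the rule and of its subobject exception, consider the sketch below. The type `Pair` and both function names are invented for this example.

```cpp
#include <cassert>

// Hypothetical standard-layout type used only for this illustration.
struct Pair {
    int first;
    int second;
};

// Two complete objects with overlapping lifetimes have different
// addresses, even when they hold the same value.
bool separate_objects_have_distinct_addresses() {
    const int x = 123;
    const int y = 123;
    return &x != &y;
}

// The exception: a subobject may share its address with the object that
// contains it. For a standard-layout struct, the first member lives at
// the same address as the struct itself.
bool first_member_shares_address_with_struct() {
    Pair p{123, 123};
    assert(&p.first != &p.second);  // sibling members are still distinct objects
    return static_cast<void *>(&p) == static_cast<void *>(&p.first);
}
```

Both functions return `true`: equal values never make two separate objects share an address, while `p` and `p.first` may share one because one is a subobject of the other.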

Interestingly, the as-if rule might permit the program to place these temporary objects at the same address, but only if your compiler and linker can prove that your program can never observe them as being allocated at the same address. In your example, this is infeasible since you print the addresses of the objects.

  • This reminds me of quantum mechanics, in the sense that my observation is part of the experiment. By the way, I have trouble understanding this kind of language rule: *... are guaranteed to ...* for what purpose? – Kindred Nov 22 '18 at 07:01
  • @ptr_user7813604 That's a philosophical question, and you would need to dive into the C++ drafts and proposals to find the answer. One probable reason is that object identity can be used in computations: if the object they refer to were the same, you could never distinguish r1 from r2 except by name. – Euri Pinhollow Nov 22 '18 at 07:47
  • OK, I like this reason, and I'd infer further that the compiler can only optimize up to a theoretical limit, at which everything is almost indistinguishable. By the way, I like philosophical questions because they don't necessarily have an answer. – Kindred Nov 22 '18 at 08:06
  • I don't understand your *["] Since the references are bound to 3 distinct temporary, then you cannot observe these objects on __overlapping addresses__. [."]*, does it mean that *__some__ of them have the same address*? – Kindred Nov 22 '18 at 08:16
  • @ptr_user7813604 "overlapping" refers not to the address of the first byte but to the range of addresses the objects occupy. This limitation means that no byte can be occupied by more than one object (except as stated in the excerpt). – Euri Pinhollow Nov 22 '18 at 11:46