
Thinking about the question "Why can I initialize reference member with data member before initializing the data member?" as well as "Why is initialization of a new variable by itself valid?", the following strange code came to my mind:

int main()
{
    int &ri = ri;
    ri = 0;
}

This code compiles (live demo). I know this is UB, but I just wonder what happens inside the compiler: where this reference actually points to?

Update: a different situation arises when the reference is global (godbolt): gcc seems to declare it as a pointer and make it point to itself.

Note for downvoters: I know this code is invalid and I am not expecting any particular behavior from a compiler, but I think looking inside may help to understand why the following code is not illegal and how references work in the language:

struct foo {
    int &ri = i;
    int i = 0;
};

or this:

 extern int i;
 int &ri = i;
 int i = 0;

Update2: while the assignment is definitely UB, it is a good question whether the declaration itself:

int &ri = ri;

is UB or not. This seems to be pure evil: ri cannot be used in any way, but the declaration itself seems to be valid.

Slava
  • You've already stated that you know it's UB. By definition, we can't tell you what an arbitrary compiler does. – erip Jul 19 '18 at 15:43
  • If you can read some x86 assembly, this compiler might help you: https://godbolt.org – smac89 Jul 19 '18 at 15:46
  • You can check the assembly [here](https://godbolt.org/g/qXMUQt) – NathanOliver Jul 19 '18 at 15:46
  • The reference refers to an object that's strikingly similar to the one those dangling pointers point to. – François Andrieux Jul 19 '18 at 15:48
  • clang++ -O generates an invalid opcode because it recognized the UB :-) – PaulR Jul 19 '18 at 15:54
  • @PaulR: I would be more impressed if it raised an error, considering all paths lead to UB. – Deduplicator Jul 19 '18 at 16:01
  • The whole idea of labeling a certain construct as "undefined behaviour" is that compiler builders *do not have to take this construct into account*. Most importantly, they do not have to have a *defined* behaviour when facing *undefined* behaviour in your code... – DevSolar Jul 19 '18 at 16:06
  • @DevSolar maybe I did not make myself clear, but point is I do not want to use such code and do not want to rely on particular behavior. I just wonder what happens under the hood there and understanding it may help me to understand the language better. – Slava Jul 19 '18 at 16:09
  • @Slava: The point is that the behaviour is *undefined*. The compiler might crash on even days in the month and emit a beep on uneven days. It might enter a non-terminating loop. It might start writing zeroes into the object file until it runs out of disk space. The only way to really figure it out is to try -- and remember that the next version of the compiler might "handle" the situation another way entirely. – DevSolar Jul 19 '18 at 16:13
  • There is no "understanding of the language" to be found here. The behaviour is *not defined by the language*. You're off the chart. – DevSolar Jul 19 '18 at 16:14
  • @DevSolar so what? Instead it produced assembly, and looking into it helped me to understand some things better. This is what SO is for, is it not? – Slava Jul 19 '18 at 16:15
  • @Slava: No, the *correct* answer is "the behaviour is undefined". You can look up what a *specific* compiler in a *specific* version does with a *specific* piece of code where this occurs, but *that has nothing to do with the language C++*. It is a detail of how said compiler reacts to erroneous input. – DevSolar Jul 19 '18 at 16:17
  • It is equivalent to `int* p = p; *p = 0;`. It should issue an _`uninitialized local variable used`_ warning, at least. – zdf Jul 19 '18 at 16:30
  • Answering my own question: `int value = createValueAndRememberPointer(&value);` –  Jul 19 '18 at 16:39
  • @ZDF kind of, but it would not compile on C++ without cast. But there is bigger difference - unlike pointer reference does not have to be represented in memory at all, which makes it so hmm strange. – Slava Jul 19 '18 at 17:35
  • @DevSolar fine, but whole concept that reference as an alias (which does not exist in memory) pointing to itself is so funny I could not resist to ask about it. – Slava Jul 19 '18 at 17:40
  • @Slava: ...and because it's so "funny" (a.k.a. not making sense), no behaviour is defined for this case. See, it's a snake biting its own tail, really. – DevSolar Jul 19 '18 at 17:45
  • @DevSolar I know that the behaviour is not defined; some questions do not need answers, they are good by themselves. I think this one is just good to think about; if the community disagrees, I am fine with that. – Slava Jul 19 '18 at 17:52
  • With your edits you're just making it *more* unlikely to get a good answer. The `struct foo` construct is, for example, legal for a very specific reason -- but completely unrelated to the rest of your question. – DevSolar Jul 19 '18 at 18:29
  • @DevSolar actually, after some thought: while the assignment is definitely UB, this declaration is not IMHO, so it can be analyzed and discussed – Slava Jul 20 '18 at 13:31
  • "good just to think about" - so it's like a C++ koan? This kind of thing might be informative when you're looking at a given compiler implementation (although I'm still not sure it's _useful_ information), but it's just not correct to suggest it tells you anything about the language. – Useless Jul 20 '18 at 14:38

1 Answer


Not an answer, just a long comment.

I do not know what you mean by "...it would not compile on C++ without cast...".

As far as I know, it is not specified how references must be implemented. However, they must be pointers in disguise (what else could they be?). The reference version and the pointer version of the source will generate exactly the same code.

"...does not have to be represented in memory at all,..." I am not sure I understand what this means. The address of a reference is the address of the object it refers to. Therefore you might say the reference itself has no address. The following will print i's address twice:

int i;
int &ri = i;
cout << &ri << endl << &i;

Since the language allows the use of uninitialized variables, I think a good compiler must issue a warning but not an error: uninitialized variable used.

[EDIT]

I had a short discussion with someone involved in the design of C++ (I forgot to ask permission to name him). In short, banning T x = x; is desirable, but C++ was not designed from scratch: it had to be backward compatible with C.

So, you found a legal way to create an uninitialized reference. The answer to "...where this reference actually points to?" is: where an uninitialized pointer points.

zdf
  • "...it would not compile on C++ without cast...". I actually thought about `int *p = &p;`, my confusion – Slava Jul 19 '18 at 20:17
  • "...does not have to be represented in memory at all,..." I am not sure I understand what this means. I mean when you declare a pointer, it is represented as a variable in memory, but reference does not have to, as it can be just another name for a variable. – Slava Jul 19 '18 at 20:19
  • "I think a good compiler must issue a warning but not an error: uninitialized variable used" disagree, unlike other cases you cannot make this reference initialized at all, it is completely broken and self contained – Slava Jul 19 '18 at 20:21
  • _"...completely broken and self contained "_ `int i = i;` is legal and so it is `int& ri = ri;`. The only thing you can tell when you reach the initialization point is whether the initializer is initialized. See the edit. – zdf Jul 20 '18 at 14:42
  • No `int i = i;` is not legal, as you read from uninitialized variable, which is UB (at least for local variables) – Slava Jul 20 '18 at 14:54
  • UB does not equal illegal. `int i = i;` is legal for the name is in scope before the start of the initializer (C rule). The compiler should warn you about using an uninitialized variable (`= i`). On my compiler I get a warning or an error - depends on settings. – zdf Jul 20 '18 at 15:10
  • "UB does not equal illegal" it is matter on what you mean by illegal. Anyway what I meant is `int &ri = ri;` is not even UB, unlike `int i = i;`, at least I think it is not. – Slava Jul 20 '18 at 15:12
  • I agree this is not right, and so does the gentleman I was talking to, but it is legal and it cannot be changed for it will break the backward C compatibility (and, I guess, it is not desirable to treat the reference case differently). After all why not live with it? I didn't get warnings for using uninitialized variables for years - to me these warnings look like recent additions :o). – zdf Jul 20 '18 at 15:21