
When bad C++ code creates a null reference, like the following:

```cpp
int &ptr2ref(int *p) {
    return *p;   // with p == nullptr, this binds the "null reference"
}

int calc(int &v) {
    return v * 2;
}

// ...
int &i = ptr2ref(nullptr);
calc(i);
```

At least with Visual C++, the crash happens in the return statement of `calc` (in debug mode).

However, the answer to this question quotes:

8.3.2/1:

A reference shall be initialized to refer to a valid object or function. [Note: in particular, a null reference cannot exist in a well-defined program, because the only way to create such a reference would be to bind it to the “object” obtained by dereferencing a null pointer, which causes undefined behavior. As described in 9.6, a reference cannot be bound directly to a bit-field. ]

1.9/4:

Certain other operations are described in this International Standard as undefined (for example, the effect of dereferencing the null pointer)

If I understand correctly, the standard says that as soon as a null reference is created, the program's behavior is undefined. So if a compiler intends to generate useful debug information, it should crash in function `ptr2ref` in the above example, since that is where the null reference is created (and the dereferencing happens).

Am I missing something? Is there any issue that stops the compiler from generating such code, at least in debug mode?

Undefined Behaviour

I know people will argue that "undefined" means roughly anything. My argument is: given that the standard does not specify how long a simple `int main(){}` shall take to compile, no one would accept a compile time of more than a day. So the problem is about implementation choices, not the standard itself. I quoted the standard here just to show that crashing in `ptr2ref` IS an option.

Furthermore, there is already a lot of additional checking happening in debug mode; for example, the stack is checked for corruption before returning from a function. Compared to those, I don't think adding a relatively simple check would be too expensive in debug mode.
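To make the idea concrete, here is a hand-written sketch of the check I have in mind. The helper name `checked_deref` is my own invention, not anything a real compiler emits; it only illustrates what debug-mode instrumentation *could* do at the point where the reference is bound:

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical helper (the name "checked_deref" is invented here, not a
// real compiler intrinsic): what a debug-mode compiler could emit around
// every dereference that binds a reference.
static int &checked_deref(int *p) {
    if (p == nullptr) {
        std::fputs("null reference created here\n", stderr);
        std::abort();   // crash at the point of creation, not later
    }
    return *p;
}

int &ptr2ref(int *p) {
    return checked_deref(p);   // instrumented version of "return *p;"
}

int calc(int &v) {
    return v * 2;
}
```

With this in place, `ptr2ref(nullptr)` aborts inside `ptr2ref` with a message, while the valid-pointer path behaves exactly as before.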

Earth Engine
  • I don't understand your question. – Bryan Chen Jul 17 '14 at 04:12
  • I mean, the popular compiler crashes at `calc` but in my opinion a good compiler shall crash at `ptr2ref`. – Earth Engine Jul 17 '14 at 04:13
  • 2
    **Undefined Behavior** means *anything can happen*. Crashing is just one possible event. – Rakib Jul 17 '14 at 04:14
  • you are right, but I am arguing what a "good" compiler should be, if being "good" is not too hard. – Earth Engine Jul 17 '14 at 04:15
  • "good" is subjective. If you dereference a `nullptr`, then never use its value, then a good compiler may choose to ignore *dereferencing it* and just continue. – Rakib Jul 17 '14 at 04:21
  • @RakibulHasan So your opinion is that such a compiler is "good". How would you reason about this? I can list some disadvantages of this option, like being more error-prone, etc.; what is its benefit? – Earth Engine Jul 17 '14 at 04:34

3 Answers

4

"Undefined behavior" does not mean "Crash Now".

It is defined in the C++ standard, section 1.3.24:

Behavior for which this International Standard imposes no requirements

[ Note: Undefined behavior may be expected when this International Standard omits any explicit definition of behavior or when a program uses an erroneous construct or erroneous data. Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message). Many erroneous program constructs do not engender undefined behavior; they are required to be diagnosed.

  • The program is not required to crash as soon as you bind your null reference.
  • Having a compiler generate code to check those cases would incur a dramatic overhead on the program, which is not acceptable.
quantdev
  • 1
    @EarthEngine: We did read the highlighted text -- there is no requirement for debug mode to do anything differently – Soren Jul 17 '14 at 04:20
  • @Earth Engine I just added a point, and sth made it clear too – quantdev Jul 17 '14 at 04:24
  • I also mentioned **debug mode**, overhead is not an issue in this case. See my edit. – Earth Engine Jul 17 '14 at 04:28
  • @Soren I never said that compiler **MUST** do such and such, but I said doing so is an **option** and we are going to reason between those options. – Earth Engine Jul 17 '14 at 05:31
  • @EarthEngine -- the point here is that you are arguing that for debug mode to be useful, it needs to do something which in the specification is undefined behaviour, and your argument is that in one implementation the compiler does something and in another it does not -- if you depend on the compiler doing something specific in an undefined situation, you had better revise your coding strategy. – Soren Jul 17 '14 at 14:50
  • @Soren -- I don't understand: if this is undefined behaviour, what has to be done in the specification if it is already **ALLOWED**? Also, since this is for debug only, people would use this feature to find potential problems. This feature should not be relied on, but if it is there it is helpful, that is it. – Earth Engine Jul 17 '14 at 22:54
  • @EarthEngine -- but this is where you are using circular logic -- it is not allowed, because it is undefined. The fact that you can construct an example casting null pointers to a reference does neither make it allowed nor defined. – Soren Jul 18 '14 at 00:16
  • @Soren When I said "allowed" here I meant the compiler is allowed to do something, not the programmer. If some code is allowed, the compiler is not allowed to reject the program. If some code is designated a "syntax error" by the standard, the compiler is likewise not allowed to accept it. However, if the standard says something is "undefined", the compiler is allowed to do whatever it wants. – Earth Engine Jul 20 '14 at 23:21
1

"Undefined behavior" means that anything can happen and the compiler is not obligated to do anything specific. In this case, nothing catastrophic happens at the null pointer dereference; it just puts the program in an invalid state by creating a null reference, which causes problems later on.

Of course it would be desirable if the error could be detected earlier, but the only way to do so would be for the compiler to add explicit null pointer checks to all dereference operations, which would only waste performance in a well behaved (no null pointers used incorrectly) program. Since null pointer dereferences usually quickly lead to crashes anyway, this is probably not seen as being worth it even in debug mode.
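For what it is worth, some toolchains now offer exactly this kind of opt-in check: GCC and Clang's `-fsanitize=null` (part of UndefinedBehaviorSanitizer) instruments dereferences and reports the bad reference binding where it happens, i.e. inside `ptr2ref`. A sketch, where the compile command in the comment uses a real flag but a placeholder file name:

```cpp
// Compile with, e.g.:  clang++ -g -fsanitize=null nullref.cpp
// (-fsanitize=null is a real Clang/GCC option; "nullref.cpp" is just a
// placeholder name for this file.) When p is null, UBSan reports the
// reference binding on the offending line instead of letting the
// program limp on to a later crash.

int &ptr2ref(int *p) {
    return *p;   // reported here when p == nullptr under UBSan
}

int calc(int &v) {
    return v * 2;   // without instrumentation, the crash surfaces here
}
```

This suggests the trade-off described above is real: the check exists, but as opt-in instrumentation rather than default debug-mode behavior.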

sth
  • For a normal dereference like `int i = *p` with a null `p`, the program crashes immediately in debug mode (or even in release mode). So this check is already there, and it is not a problem at all. – Earth Engine Jul 17 '14 at 04:31
  • @EarthEngine: No, that's not a check. It is just your program trying to access the memory location, in that case to get the `int` value that is supposedly stored there to copy it into `i`. And then the OS shutting down the program because it tries to read from strange memory locations. (That's what I meant by "quickly leads to crashes anyway") For creating a reference you don't get that, since that's basically just a pointer under the hood. But you get the same effect when you try to access the *value* at that reference. – sth Jul 17 '14 at 04:41
  • What you said is true, but it is a deep implementation detail, and that can change. Does the standard say that when binding a reference by dereferencing a pointer, the memory location **MUST NOT** be accessed at all? If so I will accept your answer, but I am afraid not. – Earth Engine Jul 17 '14 at 04:48
  • 1
    @EarthEngine yes, it does say that when doing `T &t = *p;` (and `p` has type `T *`) that the memory location pointed to by `p` is not accessed. However it also explicitly says that p must be pointing to an object at the point in time when the reference is bound. – M.M Jul 17 '14 at 04:54
  • @EarthEngine: I don't know if it says anything like that, but the null pointer case is undefined behavior anyway. And therefore why that crashes or not is an implementation detail of the compiler. According to the standard all you can say is "undefined behavior" which means that we don't know and don't care what will happen. – sth Jul 17 '14 at 04:57
1

If I understand correctly, the standard says that as soon as a null reference is created, the program's behavior is undefined.

Yes that is correct. The text that assures this is:

A reference shall be initialized to refer to a valid object or function.

The result of dereferencing a null pointer is certainly not a valid object or function.

You also quote the following text:

[Note: in particular, a null reference cannot exist in a well-defined program, because the only way to create such a reference would be to bind it to the “object” obtained by dereferencing a null pointer, which causes undefined behavior. As described in 9.6, a reference cannot be bound directly to a bit-field. ]

However, "Note" means that it is non-normative, i.e. the text is meant to be explanatory but does not actually constitute a part of the standard specification. And, somewhat surprisingly, it turns out that the Standard doesn't actually say anywhere (that I'm aware of) that `*p` causes undefined behaviour.

It does say that lvalue-to-prvalue conversion on `*p` causes undefined behaviour, but it also says that this conversion is not performed in the case of binding a reference.

This came up in CWG issue 1102.
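The distinction drawn here can be illustrated side by side (only the valid-pointer behavior is shown; the null-pointer cases remain undefined):

```cpp
#include <cassert>

// Binding a reference performs no lvalue-to-prvalue conversion:
// nothing is read from memory; the reference just aliases *p.
int &bind_only(int *p) {
    return *p;   // no load of the pointee happens here
}

// Using the value does perform lvalue-to-prvalue conversion -- the
// operation the standard explicitly leaves undefined for a null pointer.
int read_value(int *p) {
    return *p;   // the pointee is actually loaded here
}
```

With a valid pointer both are fine; `bind_only` merely yields another name for the object, while `read_value` actually touches its storage.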

M.M
  • So, in your opinion, compiler vendors **can** choose to crash on creating a null reference, is that right? – Earth Engine Jul 17 '14 at 04:59
  • 1
    Yes, they could choose to crash on creating a null reference. Whether that's a good idea or not is another matter. – M.M Jul 17 '14 at 05:01
  • So this is what my question is about. Can you give your reasoning for both options? – Earth Engine Jul 17 '14 at 05:23
  • Adding in this check would slow down correct programs, so it's not something that should be done in release mode. You'd have to ask a particular compiler vendor as to why or why not they do it in debug mode. I haven't written a compiler so I can't offer any insight here. – M.M Jul 17 '14 at 05:27