
I'm using N3936 as a reference here (please correct this question if any of the C++14 text differs).

Under 3.10 Lvalues and rvalues we have:

Every expression belongs to exactly one of the fundamental classifications in this taxonomy: lvalue, xvalue, or prvalue.

However the definition of lvalue reads:

An lvalue [...] designates a function or an object.

In 4.1 Lvalue-to-rvalue conversion the text appears:

[...] In all other cases, the result of the conversion is determined according to the following rules: [...] Otherwise, the value contained in the object indicated by the glvalue is the prvalue result.

My question is: what happens in code where the lvalue does not designate an object? There are two canonical examples:

Example 1:

int *p = nullptr;
*p;
int &q = *p;
int a = *p;

Example 2:

int arr[4];
int *p = arr + 4;
*p;
int &q = *p;
std::sort(arr, &q);

Which lines (if any) are ill-formed and/or cause undefined behaviour?

Referring to Example 1: is *p an lvalue? According to my first quote it must be. However, my second quote excludes it since *p does not designate an object. (It's certainly not an xvalue or a prvalue either).

But if you interpret my second quote to mean that *p is actually an lvalue, then it is not covered at all by the lvalue-to-rvalue conversion rules. You may take the catch-all rule that "anything not defined by the Standard is undefined behaviour" but then you must permit null references to exist, so long as there is no lvalue-to-rvalue conversion performed.
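As a side check on the first point, here is a minimal sketch (assuming a C++11 or later compiler; the helper name `deref_is_lvalue` is invented for illustration and is not from the Standard). It shows that a compiler classifies `*p` as an lvalue from its type alone, without evaluating it, because the operand of `decltype` is unevaluated. This does not settle how 3.10 should be read; it only illustrates that the classification is made statically:

```cpp
#include <type_traits>
#include <utility>

// Hypothetical helper: reports whether the expression *p is classified
// as an lvalue for a pointer type P. The operand of decltype is
// unevaluated, so no indirection is ever performed.
template <typename P>
constexpr bool deref_is_lvalue() {
    // For an expression that is not an id-expression, decltype yields T&
    // exactly when the expression is an lvalue.
    return std::is_lvalue_reference<decltype(*std::declval<P>())>::value;
}

// Holds for every value p could have at run time, including null.
static_assert(deref_is_lvalue<int*>(), "*p is classified as an lvalue");
```

The check passes regardless of what `p` holds at run time, which is the tension with "an lvalue designates a function or an object".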

History: This issue was raised in DR 232. In C++11 the resolution from DR 232 did in fact appear. Quoting from N3337, Lvalue-to-rvalue conversion:

If the object to which the glvalue refers is not an object of type T and is not an object of a type derived from T, or if the object is uninitialized, a program that necessitates this conversion has undefined behavior.

which still appears to permit null references to exist; it only clears up the issue of performing lvalue-to-rvalue conversion on one. This was also discussed on this SO thread.

The resolution from DR232 no longer appears in N3797 or N3936 though.

M.M
  • I think the intent is that whether an expression is an lvalue or not is determinable at compile time. So given `int *p = nullptr;`, the expression `*p` is an lvalue even though it doesn't *currently* designate an object; evaluating it causes undefined behavior. Look at the definitions of *lvalue* in the C90, C99, and C11 standards; it took the C committee a couple of decades to get the definition right. (Under the C99 definition, taken literally, `42` is an lvalue and evaluating it causes undefined behavior -- clearly not the intent.) – Keith Thompson Oct 10 '14 at 01:40
  • @KeithThompson so you don't restrict *evaluating* it to necessitating an lvalue-to-rvalue conversion? – M.M Oct 10 '14 at 01:47
  • There is a lot of under-specified territory in this area, for example [What is the value category of the operands of C++ operators when unspecified?](http://stackoverflow.com/q/14991219/1708801) and [Does initialization entail lvalue-to-rvalue conversion? Is `int x = x;` UB?](http://stackoverflow.com/q/14935722/1708801) – Shafik Yaghmour Oct 10 '14 at 01:50
  • @ShafikYaghmour thanks for those links. I'm inclined to treat the "value category" clause as being defective. But I'm not sure what the fix is. In C11 they say `*p` is an lvalue but it causes UB if it does not designate an object when evaluated, except for the special case `&*p`. In C++ there is not such a simple solution because of the possibility of taking a reference. – M.M Oct 10 '14 at 01:58
  • @MattMcNabb I still have not gotten a satisfactory answer to [Does the standard mandate an lvalue-to-rvalue conversion of the pointer variable when applying indirection?](http://stackoverflow.com/q/21053273/1708801) and I doubt I will until [defect report 1642](http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#1642) is addressed. – Shafik Yaghmour Oct 10 '14 at 01:59
  • @ShafikYaghmour yes, it also seems to be underspecified exactly when lvalue-to-rvalue conversion happens. Common sense seems to say that `int &q = *p;` should not be one since it does not need to access the memory; but that means that null references are "in" again. – M.M Oct 10 '14 at 02:01
  • There is a note in section `8.3.2` *References* which says *Note: in particular, a null reference cannot exist in a well-defined program, because the only way to create such a reference would be to bind it to the “object” obtained by indirection through a null pointer, which causes undefined behavior.* ... That note has been there since at least `N1804` – Shafik Yaghmour Oct 10 '14 at 02:08
  • @ShafikYaghmour notes are non-normative, and it does not seem to say anywhere else (besides the defects I am highlighting in this post) that indirection through a null pointer causes UB , so I would not consider this note to provide a resolution; at best it's a statement of intent but we are still left with the problem of how to rescue that intent from 3.10 and 4.1 – M.M Oct 10 '14 at 02:10
  • I think [DR 453](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3480.html#453) speaks directly to this issue and specifies the intent is that the examples are undefined behavior. Although it is not yet applied it does speak to intent, I think this may be the best you will get, it was last updated in 2012 even though it is from 2004 and so seems to be actively worked on. – Shafik Yaghmour Oct 10 '14 at 02:19
  • @ShafikYaghmour that resolution is from 2004 and the green text is not in N3936; in fact [dcl.ref]/5 in N3936 explicitly says that a reference binding to an lvalue does not cause lvalue-to-rvalue conversion – M.M Oct 10 '14 at 02:22

2 Answers


It isn't possible to create a reference to null or a reference to the off-the-end element of an array, because section 8.3.2 says (reading from draft n3936) that

A reference shall be initialized to refer to a valid object or function.

However, it is not clear that forming an expression with a value category of lvalue constitutes "initialization of a reference". Quite the contrary, in fact, temporary objects are objects, and references are not objects, so it cannot be said that *(a+n) initializes a temporary object of reference type.
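To keep the three operations in this discussion apart, here is a small sketch on a *valid* object (assuming C++14 `constexpr`; the function name `bind_then_read` is made up for illustration). It separates forming the lvalue `*p`, binding a reference to it, and performing the lvalue-to-rvalue conversion that actually reads the value. It says nothing about the null or past-the-end cases, which are exactly the ones in dispute:

```cpp
// Hypothetical illustration: distinguishes forming an lvalue, binding a
// reference to it, and reading through it. Uses only a valid object.
constexpr int bind_then_read() {
    int x = 1;
    int* p = &x;
    (void)*p;      // forms the lvalue and discards it; no value is read
    int& q = *p;   // binds q to x; the value 1 is not read here
    x = 2;         // modify the object after the reference was bound
    int a = q;     // lvalue-to-rvalue conversion happens here and reads 2
    return a;
}

static_assert(bind_then_read() == 2, "the read happens at 'int a = q', not at the binding");
```

Because the result is 2 rather than 1, the binding `int& q = *p;` cannot have captured the value, which is consistent with the reading that reference binding involves no lvalue-to-rvalue conversion.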

Ben Voigt
  • OK that takes the null reference out of the equation; but it doesn't resolve `&*p`, nor what happens on lvalue-to-rvalue conversion of an lvalue that doesn't designate a valid object. – M.M Oct 10 '14 at 06:02
  • Hmmm, that does not quite jibe with `DR 453` that I link to above. – Shafik Yaghmour Oct 10 '14 at 09:33
  • @ShafikYaghmour: The Standard text is normative and up-to-date, the DR isn't. – Ben Voigt Oct 10 '14 at 12:23
  • The DR seems to say the standard is underspecified and that would imply we can not use the current standard to address the question. Perhaps I am misreading it, how do you interpret the DR? – Shafik Yaghmour Oct 10 '14 at 12:24
  • @MattMcNabb: Well, the qualification "*valid* object" implies that it is possible to have an lvalue to an invalid object, in which case it appears that `*p` is exactly that. (lvalue-to-rvalue conversion wording supports the same argument, there would be no rule concerning conversion of lvalues to invalid objects if such expressions were not lvalues) – Ben Voigt Oct 10 '14 at 12:28
  • @Shafik: I think that DR clearly says that it is possible to have "an lvalue ... designates neither an existing object or function of an appropriate type", because it restricts what can be done with that lvalue. Which invalidates the premise in the question that lvalues must designate existing objects. – Ben Voigt Oct 10 '14 at 12:31
  • Agreed, which is this concept of an *empty lvalue* from `DR 232` which is also not part of the standard :-( – Shafik Yaghmour Oct 10 '14 at 12:34

I think the answer to this, although probably not the one you really want, is that this area is under-specified or ill-specified, and therefore we cannot really say whether the examples you have provided are ill-formed or invoke undefined behavior according to the current draft standard.

We can see this by looking at DR 232 and DR 453.

DR 232 tells us that the standard conflicts with itself on whether dereferencing a null pointer is undefined behavior:

At least a couple of places in the IS state that indirection through a null pointer produces undefined behavior: 1.9 [intro.execution] paragraph 4 gives "dereferencing the null pointer" as an example of undefined behavior, and 8.3.2 [dcl.ref] paragraph 4 (in a note) uses this supposedly undefined behavior as justification for the nonexistence of "null references."

However, 5.3.1 [expr.unary.op] paragraph 1, which describes the unary "*" operator, does not say that the behavior is undefined if the operand is a null pointer, as one might expect. Furthermore, at least one passage gives dereferencing a null pointer well-defined behavior: 5.2.8 [expr.typeid] paragraph 2 says

and introduces the concept of an empty lvalue, which is the result of indirection through a null pointer or a pointer one past the end of an array:

if any. If the pointer is a null pointer value (4.10 [conv.ptr]) or points one past the last element of an array object (5.7 [expr.add]), the result is an empty lvalue and does not refer to any object or function.

and proposes that the lvalue-to-rvalue conversion of such an lvalue is undefined behavior.

DR 453 tells us that we don't know what a valid object is:

What is a "valid" object? In particular the expression "valid object" seems to exclude uninitialized objects, but the response to Core Issue 363 clearly says that's not the intent.

and suggests that binding a reference to an empty lvalue is undefined behavior:

If an lvalue to which a reference is directly bound designates neither an existing object or function of an appropriate type (8.5.3 [dcl.init.ref]), nor a region of memory of suitable size and alignment to contain an object of the reference's type (1.8 [intro.object], 3.8 [basic.life], 3.9 [basic.types]), the behavior is undefined.

and includes the following examples in the proposal:

int& f(int&);
int& g();

extern int& ir3;
int* ip = 0;

int& ir1 = *ip;     // undefined behavior: null pointer
int& ir2 = f(ir3);  // undefined behavior: ir3 not yet initialized
int& ir3 = g();
int& ir4 = f(ir4);  // ill-formed: ir4 used in its own initializer

So if we restrict ourselves to intent, I feel that DR 232 and DR 453 provide the information we need: the intention is that lvalue-to-rvalue conversion of an lvalue obtained by indirection through a null pointer is undefined behavior, and that binding a reference to such an lvalue, or to a reference that has not yet been initialized, is also undefined behavior.

Although both of these defect reports have taken a while to sort out, both are active with relatively recent updates, and so far the committee apparently does not dispute the main premise that the reported defects are actual defects. It follows that, until these two issues are resolved, it is not possible to give a definitive answer to your question using the current draft standards.
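Until then, code can sidestep the disputed cases by never applying `*` to a null or past-the-end pointer. As a rough sketch (assuming C++14 `constexpr`; the names `sum_range` and `sum_example` are invented here), a past-the-end pointer can be formed by pointer arithmetic and used only for comparison, which is the pattern Example 2 could have used instead of `&q`:

```cpp
// Hypothetical helper: sums the half-open range [first, last).
// 'last' is only compared against; it is never dereferenced.
constexpr int sum_range(const int* first, const int* last) {
    int total = 0;
    for (; first != last; ++first) {
        total += *first;   // indirection only on pointers to real elements
    }
    return total;
}

constexpr int sum_example() {
    int arr[4] = {1, 2, 3, 4};
    const int* end = arr + 4;   // past-the-end pointer from arithmetic; no '*' applied to it
    return sum_range(arr, end);
}

static_assert(sum_example() == 10, "1 + 2 + 3 + 4");
```

The same idea applies to Example 2: `std::sort(arr, arr + 4)` passes the past-the-end pointer directly, so the question of what `*p` designates never arises.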

Shafik Yaghmour
  • Before accepting this answer we also need to explain why the intent in DR 232 and DR 453 did not actually make it into C++14 – M.M Oct 11 '14 at 02:57