I want to call a function whose signature looks like this:

void foo(int (&ra)[2]);

That is, its argument is a reference to an array of two elements. Let's suppose the author of that function (who shall remain nameless) is not inclined to change the interface.

I want to invoke the function on a sub-array of an existing array. Like this:

void bar()
{
  int vals[] = { 1, 2, 3, 4, 5 };

  foo(reinterpret_cast<int (&)[2]>(vals[1]));
}

GCC accepts this code and compiles it to what I want, but I am not at all sure this is guaranteed by the language.

Is the behavior of this code implementation-defined (or undefined)? If so, is there a way to rewrite bar() to do the same thing in a well-defined way?

I am more likely to accept your answer if you cite chapter and verse of the C++03 or C++11 standard.

Nemo
  • Something tells me I've seen this before and it was valid, but don't quote me on it. – chris Mar 01 '14 at 20:07
  • shouldn't you have `&vals[1]` instead of `vals[1]`? – bolov Mar 01 '14 at 20:11
  • @bolov, It's a reference, not a pointer. – chris Mar 01 '14 at 20:13
  • in principle a compiler can add range checking and such. the formal UB mainly allows such things. in practice you'll be OK. – Cheers and hth. - Alf Mar 01 '14 at 20:13
  • @bolov: No; `&vals[1]` is a pointer. I am casting to a reference to an array. (I could also write this by casting `&vals[1]` to a pointer to an array and then dereferencing that. I have no idea whether that would be better-defined.) – Nemo Mar 01 '14 at 20:14
  • @chris Maybe [this similar question](http://stackoverflow.com/q/21317140/420683)? The answers should also be relevant to this question. – dyp Mar 01 '14 at 21:36
  • @dyp, It was before 2014, but thanks. – chris Mar 01 '14 at 21:39
  • I thought you were deleting all your content, leaving and never coming back? – Lightness Races in Orbit Apr 13 '14 at 01:43
  • How about `foo(*reinterpret_cast<int (*)[2]>(&vals[1]));`? – L. F. Jul 23 '19 at 03:23
  • Ohh for $Deity's sake, forget about C-style arrays (and casts) already. Just use `std::array` or `std::vector` and everything will be much easier (and just as efficient). Why are people still learning C++ like it's 1998? It's frustrating. – Jesper Juhl May 03 '23 at 01:45

1 Answer

It's unclear. C++11 [dcl.ref]/5 says

[...] A reference shall be initialized to refer to a valid object or function.

But what is a "valid" object? Certainly there is no object of type int[2] at that memory location. Then again, it's widely accepted that a reference may bind to allocated memory in which an object of the appropriate type will later be constructed, provided the construction itself has well-defined behaviour. (In fact, although C++11 doesn't do this, later versions of the standard library specification use an exposition-only function called voidify to define std::uninitialized_copy, and that definition binds a reference to the uninitialized memory location before placement new is called. So the library specification itself relies on reference binding being allowed in at least some cases where no live object is present.)
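To make the parenthetical above concrete, here is a rough sketch of the pattern the library wording relies on: a reference is bound to storage that holds no object yet, and placement new then constructs the object there. The names `voidify_sketch` and `construct_through_reference` are made up for illustration, and the helper only approximates the exposition-only voidify; whether the early binding is strictly sanctioned is exactly the open question.

```cpp
#include <memory>
#include <new>
#include <string>

// Hypothetical approximation of the exposition-only voidify helper:
// turns a reference into a plain void* suitable for placement new.
template <class T>
void* voidify_sketch(T& obj) noexcept {
    return const_cast<void*>(
        static_cast<const volatile void*>(std::addressof(obj)));
}

std::string construct_through_reference() {
    using S = std::string;

    // Raw, suitably aligned storage; no std::string lives here yet.
    alignas(S) unsigned char storage[sizeof(S)];
    S* slot = reinterpret_cast<S*>(storage);

    // The reference is bound before any object exists at that location.
    // Nothing is read or written through it at this point.
    S& not_yet_alive = *slot;

    // Placement new constructs the object in the storage the reference names.
    S* made = ::new (voidify_sketch(not_yet_alive)) S("hello");

    S copy = *made;  // use the now-live object
    made->~S();      // end its lifetime explicitly
    return copy;
}
```

The reference is only ever used to recover an address, which is the narrow case the library specification appears to depend on.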

This is the topic of CWG 453. The issue is so old, and has gone undiscussed by CWG for so long, that it's anyone's guess how it will eventually be resolved (if ever). About the only thing one can predict is that a final resolution would probably be treated as a defect report going back to C++98, which can be interpreted as retroactively fixing older standards.

Once foo's parameter is initialized, another question arises of what is permitted to be done with it. For example, is this valid?

int x = ra[1];

The first step in evaluating this expression is the array-to-pointer conversion, which is described at [conv.array]:

An lvalue or rvalue of type "array of N T" or "array of unknown bound of T" can be converted to a prvalue of type "pointer to T". The result is a pointer to the first element of the array.

But there's no actual array of size 2 that ra refers to, so the first element of that hypothetical array is also hypothetical. So this would seem to give UB. But then again, who knows?

Let's also take a look at C++20, keeping in mind that C++17 changed the rules around pointers so that pointers point to objects, not to addresses. (Though they "represent" addresses.) The reinterpret_cast is governed by [expr.reinterpret.cast]/11:

A glvalue expression of type T1, designating an object x, can be cast to the type "reference to T2" if an expression of type "pointer to T1" can be explicitly converted to the type "pointer to T2" using a reinterpret_cast. The result is that of *reinterpret_cast<T2 *>(p) where p is a pointer to x of type "pointer to T1". [...]

So first a pointer to vals[1] is taken, then that pointer is reinterpret_casted to int (*)[2]. Under [expr.reinterpret.cast]/7, this first does a static_cast to void*, then a second static_cast to int (*)[2]. The first static_cast just does an implicit conversion, and under [conv.ptr]/2, an implicit conversion to cv void* leaves the value unchanged: that is, the result still points to the int object at vals[1]. The second static_cast, a static cast from void*, is governed by [expr.static.cast]/13, according to which

A prvalue of type "pointer to cv1 void" can be converted to a prvalue of type "pointer to cv2 T", where T is an object type and cv2 is the same cv-qualification as, or greater cv-qualification than, cv1. If the original pointer value represents the address A of a byte in memory and A does not satisfy the alignment requirement of T, then the resulting pointer value is unspecified. Otherwise, if the original pointer value points to an object a, and there is an object b of type T (ignoring cv-qualification) that is pointer-interconvertible (6.8.3) with a, the result is a pointer to b. Otherwise, the pointer value is unchanged by the conversion.

There's no object of type T (which is int[2]) at the memory location that the void* represents (not points to) so its value (i.e. what it points to) is left unchanged; the result of the second static_cast is still a pointer to the int object at vals[1], but it has type int (*)[2]. That might seem a bit bizarre, but that's just how it is. As the last step in evaluating the reinterpret_cast, this pointer is dereferenced. Under [expr.unary.op]/1, this gives an lvalue that refers to whatever object the pointer operand points to. So the final result of the reinterpret_cast is an lvalue of type int[2] that refers to a single int object (vals[1]).
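The chain of conversions just described can be spelled out step by step. This is only a sketch: the function name `cast_chain_keeps_address` is invented for illustration, and nothing here dereferences the resulting pointer, so it stays clear of the contested part.

```cpp
// Returns true when every step of the chain yields the same address.
bool cast_chain_keeps_address() {
    int vals[] = { 1, 2, 3, 4, 5 };

    int* p = &vals[1];                         // pointer to the int object vals[1]
    void* v = static_cast<void*>(p);           // [conv.ptr]/2: value unchanged
    int (*q)[2] = static_cast<int (*)[2]>(v);  // [expr.static.cast]/13: no int[2] object
                                               // here, so the value is again unchanged

    // The one-step form that the reinterpret_cast to a reference is defined in terms of.
    int (*r)[2] = reinterpret_cast<int (*)[2]>(p);

    // q and r are compared but never dereferenced.
    return static_cast<void*>(q) == v && q == r;
}
```

Only the static type changes along the way; what each pointer points to is still the single int at vals[1].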

Is it valid to use that lvalue to initialize ra? Maybe; maybe not. It would result in ra being initialized to refer to an object that is alive... but it doesn't have the right type for ra. There probably should be a rule that makes that UB. But there isn't.

And what happens if you get as far as the array-to-pointer conversion? [conv.array]/1 says:

An lvalue or rvalue of type "array of N T" or "array of unknown bound of T" can be converted to a prvalue of type "pointer to T". The temporary materialization conversion (7.3.5) is applied. The result is a pointer to the first element of the array.

What array? ra doesn't even refer to an array; it refers to the int object at vals[1]. So again, I guess this is UB? But the moral of the story is that the standard is very underspecified in this area.
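As for rewriting bar() in a well-defined way: one option that sidesteps the whole question is to hand foo a real int[2] and copy the elements in and out. This is a sketch under assumptions, not a drop-in replacement: the body of foo below is a made-up stand-in, `bar_copy` is a hypothetical name that takes the array as a parameter so the effect is observable, and the approach is only equivalent to the cast if foo neither keeps the reference nor cares about the address of its argument.

```cpp
#include <algorithm>

// Hypothetical stand-in for the real foo, whose body is unknown.
void foo(int (&ra)[2]) {
    ra[0] += 10;
    ra[1] += 20;
}

// Calls foo on the sub-array vals[1..2] without any cast.
void bar_copy(int (&vals)[5]) {
    int tmp[2];                          // a genuine array of two ints
    std::copy(vals + 1, vals + 3, tmp);  // copy the sub-array in
    foo(tmp);                            // the reference binds to a real int[2]
    std::copy(tmp, tmp + 2, vals + 1);   // copy any modifications back
}
```

The cost is two small copies, which a compiler may well optimize away when foo is visible for inlining.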

Brian Bi