Consider this:

int main(int, char **) {
  int variable = 21;
  int array[1] = {21};
  using ArrayOf1Int = int[1];
  (*reinterpret_cast<ArrayOf1Int *>(&variable))[0] = 42;
  *reinterpret_cast<int *>(&array) = 42;
  return 0;
}

Did I just violate the strict aliasing rule?

Or, as in this comment that led me to this question: Is a variable an array of size 1?

Note that I tagged this as a language-lawyer question. Thus I'm not interested in -fno-strict-aliasing or compiler-specific behavior, but in what the standard says. I think it would also be interesting to know if and how this changed between C++03, C++11, C++14, and newer versions.

Daniel Jour
  • Well the interface class of an array is different than that of a variable no? – Khalil Khalaf Sep 08 '16 at 22:33
  • Contents of an array are, by definition, consecutively laid out. Therefore an array of size 1 is indistinguishable from a single variable. Same layout. Of course, referencing an array gets you the pointer to the first element of the array, and referencing a non-array object gets you an lvalue for that object. – Sam Varshavchik Sep 08 '16 at 22:35
  • @FirstStep But isn't an array a "sequence" (non-official term) of objects of its base type? Thus the dynamic type of the actual object I'm accessing should - in both cases - match the type of the pointer. – Daniel Jour Sep 08 '16 at 22:38
  • @Daniel: my comment that led you here was with respect to memory layout, and pointer arithmetic. There is one very obvious difference in other respects, namely that an expression denoting a single non-array variable, e.g. an `int` variable, doesn't decay to a pointer to that variable. And so it can't be indexed. :) – Cheers and hth. - Alf Sep 08 '16 at 22:45
  • What has `decltype` to do with this? Can you tidy the question up to be just about arrays? – Kerrek SB Sep 08 '16 at 22:49
  • I would say a variable is definitely an array of `1` element but that does not mean its identifier is of *array type*. It isn't. Taking the address of an array and taking the address of its first element yield the same address and the element of the array being the same type of the individual variable means they will share the same alignment. Therefore an array of a given type will share the same alignment as a variable of that same type. – Galik Sep 08 '16 at 22:51
  • @KerrekSB `decltype` has nothing to do with the actual question (sorry if that was misleading) but rather with my failure to correctly specify the type ("pointer to array of 1 int"). – Daniel Jour Sep 08 '16 at 22:53
  • Use an alias: `using A = T[1];` – Kerrek SB Sep 08 '16 at 22:53
  • The title should be something like "Does (this cast) break the strict aliasing rule", if your question is really about that rule as suggested by the text. – M.M Sep 08 '16 at 23:38
  • casting to `T&` is simpler than casting to `T *` then dereferencing; and defined to be identical in behaviour – M.M Sep 08 '16 at 23:39
  • IMO the strict aliasing rule is extremely underspecified (this is a problem in C and C++). I feel this case should not violate the rule. As an example of the problems with the rule, consider `int a; struct S { float f; int b; } s; (int &)s = 5;`. This doesn't violate the rule because the stored value of `s` is accessed by lvalue of type `int`, which **is** amongst the members of `s`. But this code feels like it should in fact be a violation. You could say "Well it means the value of `s.f`, not `s`" but that interpretation leads to a bunch of other problems (too big to fit in comments...) – M.M Sep 08 '16 at 23:49
  • It's the other way around: an array is by definition a contiguous memory allocation. Each array index acts as a variable. If an array has only one index, then it is identical to a variable declaration. If you have a pointer holding the address of index 0 of an array, you can access all the elements by incrementing the pointer value. – Prateek Gupta Sep 09 '16 at 04:10
  • @M.M Sorry for the title (A bit "catchy", I know ..). Do you have a suggestion for a better title? ("Does (this cast) break the strict aliasing rule" seems to be far too generic IMO) – Daniel Jour Sep 09 '16 at 08:47
  • replace (this cast) with details of your cast – M.M Sep 09 '16 at 08:56

4 Answers


Clearly, if an object were an array of size one, you could initialise a reference to an array of size one with that object:

int variable{21};
int (&array)[1] = variable; // illegal

However, this initialisation is illegal. The relevant clause is Clause 4 [conv] (Standard Conversions), which states in paragraph 1:

Standard conversions are implicit conversions with built-in meaning. Clause 4 enumerates the full set of such conversions.

This clause is too long to quote here, but it says nothing about converting an object to a reference to an array of any size. Similarly, the section on reinterpret_cast (5.2.10 [expr.reinterpret.cast]) does not spell out any behaviour involving arrays, but it does spell out this exclusion in paragraph 1:

... Conversions that can be performed explicitly using reinterpret_cast are listed below. No other conversion can be performed explicitly using reinterpret_cast.

I don't think there is an explicit statement that an object is not an array of one object, but there are sufficient omissions to make the case implicitly. The guarantee the standard does give relating objects and arrays is that a pointer to an object behaves as if it pointed to the first element of an array of size 1 (5.7 [expr.add] paragraph 4):

For the purposes of these operators, a pointer to a nonarray object behaves the same as a pointer to the first element of an array of length one with the type of the object as its element type.
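As a minimal sketch of what that paragraph permits (the function name `one_past_distance` is mine, purely for illustration): you may form and compare the one-past-the-end pointer of a lone object, exactly as you could for an `int[1]`, as long as you never dereference it.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: distance between a lone int and the
// one-past-the-end pointer formed from its address.
std::ptrdiff_t one_past_distance() {
    int value = 21;         // a plain non-array object
    int* first = &value;    // behaves like a pointer to element 0 of an int[1]
    int* last = first + 1;  // one-past-the-end: may be formed and compared

    // 'last' must never be dereferenced; the quoted rule only covers
    // pointer arithmetic and comparison.
    assert(first != last);
    return last - first;
}
```

Note that this says something about the pointer, not about the type of `value`, which is still plain `int`.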

The presence of this statement also implies that array objects and nonarray objects are different entities: if they were considered the same this statement wouldn't be necessary to start with.
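The type system agrees that these are different entities. The following is only a sketch using standard `<type_traits>` facilities (the function name is made up for the example); it checks properties of the types and says nothing about aliasing.

```cpp
#include <type_traits>

// int and int[1] are distinct types, and only one of them is an array type.
static_assert(!std::is_same<int, int[1]>::value, "distinct types");
static_assert(!std::is_array<int>::value, "int is not an array type");
static_assert(std::is_array<int[1]>::value, "int[1] is an array type");
static_assert(std::extent<int[1]>::value == 1, "exactly one element");

// An int lvalue does not implicitly convert to a reference to int[1],
// which is why the initialisation shown earlier is rejected.
static_assert(!std::is_convertible<int&, int (&)[1]>::value,
              "no implicit conversion");

// The sizes do agree: an array of n elements is n times the element size.
static_assert(sizeof(int[1]) == sizeof(int), "same size");

// Hypothetical helper so the result can also be observed at run time.
constexpr bool int_and_array_are_distinct_types() {
    return !std::is_same<int, int[1]>::value;
}
```

Same size, different type: that is the gap the question is probing.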

With respect to prior (or future) versions of the standard: although the exact wording of the different clauses may have changed, the overall situation has not. Objects and arrays have always been different entities and, so far, I'm not aware of any intent to change that.

Baum mit Augen
Dietmar Kühl
  • Re "No other conversion can be performed explicitly using reinterpret_cast.", the list has never been complete since it only includes one-way conversion between a reference to struct and to its first member, while in the section on class members, it's two way (as I recall the wording is "vice versa"). But whenever that omission has been mentioned it's been met with vigorous denials and argumentation. Anyway, wrt. this answer, I think it's tangential to the intent of the question: I can't imagine that OP thought that the standard allows indexing of e.g. an `int` variable, that that's the issue. – Cheers and hth. - Alf Sep 08 '16 at 23:35
  • I think this misses the point of the question: the syntax for using a variable is different to the syntax for using an array of size 1 of course, but OP is trying to get at something like: does a variable have identical properties as the array, syntax aside. – M.M Sep 08 '16 at 23:40

reinterpret_cast only behaves predictably in C++11 and above, so neither line is guaranteed to have defined behaviour before C++11. We'll proceed assuming C++11 or above.

First line

(*reinterpret_cast<decltype(&array)>(&variable))[0] = 42;

In this line, dereferencing the reinterpret_cast yields a glvalue but does not access the int object through that glvalue. By the time the int object is accessed, the glvalue referring to the array has already decayed into a pointer to that object (that is, an int*).
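The sequence of types involved can be checked at compile time. This is a sketch only: it uses `decltype` in unevaluated contexts plus a genuine array, so it shows which types appear at each step but does not by itself settle whether the cast in the question is allowed. The alias `ArrayOf1Int` comes from the question; the function name is mine.

```cpp
#include <type_traits>
#include <utility>

using ArrayOf1Int = int[1];

// Dereferencing a pointer-to-array yields an lvalue of array type...
static_assert(std::is_same<decltype(*std::declval<ArrayOf1Int*>()),
                           ArrayOf1Int&>::value,
              "*p is an lvalue of type int[1]");

// ...array-to-pointer conversion turns that type into int*...
static_assert(std::is_same<std::decay<ArrayOf1Int>::type, int*>::value,
              "int[1] decays to int*");

// ...so the expression that finally names the element is an int lvalue.
static_assert(std::is_same<decltype((*std::declval<ArrayOf1Int*>())[0]),
                           int&>::value,
              "(*p)[0] is an lvalue of type int");

// Hypothetical run-time counterpart on a real array (no cast involved).
int write_through_pointer_to_array() {
    ArrayOf1Int arr = {21};
    ArrayOf1Int* p = &arr;  // genuine pointer to a genuine array
    (*p)[0] = 42;           // the store happens through an int lvalue
    return arr[0];
}
```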

However, one can "contrive" a case that looks like it might contain a strict aliasing violation, like so:

struct S {
    int a[1];
};
int variable = 42;
S s = reinterpret_cast<S&>(variable);

This does not violate strict aliasing because you are allowed to access an object through a subobject of an aggregate or union type. (This rule has existed since C++98.)

Second line

*reinterpret_cast<decltype(&variable)>(&array) = 42;

The reinterpret_cast is guaranteed to give a pointer to the first subobject of the array, which is an int object, so assigning to it through an int pointer is well-defined.
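The address relationship this relies on can be observed without any reinterpret_cast. A small sketch (the function name is invented for the example):

```cpp
// Hypothetical check: the array object and its first element
// start at the same address.
bool array_and_first_element_share_address() {
    int array[1] = {21};
    const void* whole = &array;     // int (*)[1] converted to const void*
    const void* first = &array[0];  // int* converted to const void*
    return whole == first;
}
```

As far as I can tell, the C++17 wording adds a note that an array object and its first element are not pointer-interconvertible even though they share an address, so this check shows equal addresses and nothing more.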

Brian Bi
  • That point about the "glvalue referring to the array has already been decayed into a pointer" really helped a lot and I think is spot-on. – Daniel Jour Sep 08 '16 at 23:06
  • Are you sure the example with `S` is legit? Doesn't the copy necessitate lvalue-to-rvalue conversion on a value that doesn't have the same type as the object that's being accessed? – Kerrek SB Sep 08 '16 at 23:53
  • @KerrekSB Is lvalue-to-rvalue conversion bound by stricter rules than strict aliasing? I can't see any in [conv.lval]. – Brian Bi Sep 08 '16 at 23:57
  • @Brian: I don't think it's "stricter", only that the existing rules seem to apply: You're accessing an object of type `int` through a glvalue of type `S`. Which is forbidden. So why do you think that's OK? Note that `variable` is not a subobject of an object of type `S`. It's a complete object already. – Kerrek SB Sep 09 '16 at 08:31

One recent draft says:

§[expr.unary.op]/3:

The result of the unary & operator is a pointer to its operand. [...] For purposes of pointer arithmetic (5.7) and comparison (5.9, 5.10), an object that is not an array element whose address is taken in this way is considered to belong to an array with one element of type T.

The types we're dealing with here are all really pointers, but we're (eventually) dereferencing them. As such, this probably isn't enough to render the behavior defined (but it's a close call).

As for changes between versions: that wording is in N4296 (a draft in between C++14 and C++17) but not N4140 or N3337 (basically C++14 and C++11 respectively).

The C11 standard has vaguely similar language for fscanf_s and fwscanf_s (§K.3.5.3.2/4):

The first of these arguments is the same as for fscanf. That argument is immediately followed in the argument list by the second argument, which has type rsize_t and gives the number of elements in the array pointed to by the first argument of the pair. If the first argument points to a scalar object, it is considered to be an array of one element.

Jerry Coffin

An array of 1 integer is not layout-compatible with an integer.

This means:

struct A {
  int x;
};
struct B {
  int y[1];
};
A a={0};
std::cout << ((B*)&a)->y[0];  // -> is required: the cast yields a pointer

is not defined behavior. See [basic.types]/11 for the definition of layout-compatible.

A::x and B::y are not the same types from [basic.types]/10 -- one is under [basic.types]/10.2 (scalar type) and the other under [basic.types]/10.4 (array of literals). They are not layout-compatible enumerations. They are not class types, so [class.name]/20-21 does not apply.

Thus [class.name]/20 (common initial sequence) does not consider x and y to be a common initial sequence.

I am unaware of a compiler that does not make A and B actually bit-for-bit identical, but the standard leaves the above reinterpretation undefined, and as such compilers are free to assume that it will never be done. Optimizers and other exploiters of strict aliasing can therefore cause unexpected behavior if you depend on it.
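If you need the bytes of an A viewed as a B, copying them is the commonly recommended route. The sketch below assumes both types are trivially copyable and have the same size on your implementation (both checked with `static_assert`); `read_as_b` is a name I made up for illustration.

```cpp
#include <cstring>
#include <type_traits>

struct A {
    int x;
};
struct B {
    int y[1];
};

static_assert(std::is_trivially_copyable<A>::value &&
                  std::is_trivially_copyable<B>::value,
              "memcpy needs trivially copyable types");
static_assert(sizeof(A) == sizeof(B),
              "assumption of this sketch: identical size");

// Hypothetical helper: copy the object representation of an A into a
// separate B instead of accessing the A through a B*.
int read_as_b(const A& a) {
    B b;
    std::memcpy(&b, &a, sizeof b);
    return b.y[0];
}
```

Here `b` is a real object of type B, so reading `b.y[0]` involves no aliasing question at all; compilers typically optimise the copy away.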

I personally would think it would be a good idea to state that an array T[N] is layout-compatible with a sequence of N adjacent Ts. This would permit a number of useful techniques, such as:

struct pixel {
  union {
    struct {
      char r, g, b, a;
    } e;
    std::array<char,4> pel;
  };
};

where pixel.pel[0] is guaranteed to correspond to pixel.e.r. But to the best of my knowledge, this is not legal.
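Until something like that is guaranteed, one way to get both views without a union is to keep the array as the only storage and expose the named channels as accessors. This is a sketch of that alternative, not the layout-compatibility technique itself; the accessor names are my own.

```cpp
#include <array>

struct pixel {
    std::array<char, 4> pel{};  // the only storage; indexable as pel[i]

    // Named views onto the same elements (hypothetical accessor names).
    char& r() { return pel[0]; }
    char& g() { return pel[1]; }
    char& b() { return pel[2]; }
    char& a() { return pel[3]; }
};
```

With this shape, `p.r()` and `p.pel[0]` name the same char object, so there is nothing for the aliasing rules to object to.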

Yakk - Adam Nevraumont