There are many discussions of strict aliasing (notably "What is the strict aliasing rule?" and "Strict aliasing rule and 'char *' pointers"), but this is a corner case I don't see explicitly addressed.

Consider this code:

int x;
char *x_alias = reinterpret_cast<char *>(&x);
x = 1;
*x_alias = 2;  // [alias-write]
printf("x is now %d\n", x);

Must the printed value reflect the change in [alias-write]? (Clearly there are endianness and representation considerations; those are not my concern here.)

The famous [basic.lval] clause of the C++11 spec uses this language (emphasis mine):

If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:

  • ... various other conditions ...
  • a char or unsigned char type.

I can't figure out whether "access" refers only to read operations (read chars from a nonchar object) or also to write operations (write chars onto a nonchar object). If there's a formal definition of "access" in the spec, I can't find it, but in other places the spec seems to use "access" for reads and "update" for writes.

This is of particular interest when deserializing; it's convenient and efficient to bring data directly from a wire into an object, without requiring an intermediate memcpy() from a char-buffer into the object.

Dan E.
2 Answers

is it defined to _write_ to a char*, then _read_ from an aliased nonchar*?

Yes.

Must the printed value reflect the change in [alias-write]?

Yes.

Strict aliasing allows a char* or unsigned char* to alias any object (C extends this to signed char* as well). The word "access" covers both read and write operations.
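
To make that concrete, here is a minimal sketch of the question's snippet wrapped in a function. The helper names `write_first_byte_then_read` and `expected_via_memcpy` are hypothetical, introduced only for illustration. The assumption is that the resulting byte pattern is a valid `int` representation, which holds on mainstream platforms.

```cpp
#include <cstring>

// Hypothetical helper: overwrite the first byte of an int through a
// char alias, then read the int back through its own type.
int write_first_byte_then_read(int initial, char byte) {
    int x = initial;
    char *x_alias = reinterpret_cast<char *>(&x);
    *x_alias = byte;  // write access through a char glvalue
    return x;         // this read has to observe the write above
}

// Reference version: the same byte store expressed with memcpy,
// which is defined for any trivially copyable object.
int expected_via_memcpy(int initial, char byte) {
    int x = initial;
    std::memcpy(&x, &byte, 1);
    return x;
}
```

Both functions should return the same value for the same arguments. Which value that is depends on endianness and on `sizeof(int)`, so no specific number is claimed here.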

Emil Laine
  • Thanks! Is there a reference to back up this interpretation? (For example, does the spec define "access" somewhere I missed? In many places, it seems to use that word to mean "read" only.) – Dan E. Mar 25 '16 at 20:09
  • Aha, found it, "3.1 access to read or modify the value of an object". – Dan E. Mar 25 '16 at 20:16
  • Of course, it's only defined if the address referred to by both pointers really has an `AliasedType` declared at it. That is: you can declare an `AliasedType`, read bytes into it, then read it; but you can't declare an array of `unsigned char`, read bytes into it, then 'magic up' an `AliasedType` from its address when there wasn't one before. – underscore_d Jul 10 '17 at 14:37
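
The distinction in the last comment can be sketched roughly as follows. `Message`, `fill_in_place`, and `copy_out` are hypothetical names, and the struct layout (two `std::uint32_t` fields, no padding) is an assumption made for the example. The first function declares the object and then writes bytes into it. The second starts from bytes that already sit in an `unsigned char` buffer and copies them out with `memcpy` instead of casting the buffer's address to `Message *`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical wire format, assumed trivially copyable and padding-free.
struct Message {
    std::uint32_t id;
    std::uint32_t length;
};

// Supported direction: a Message object exists first, and its bytes
// are written through an unsigned char alias.
Message fill_in_place(const unsigned char *wire, std::size_t n) {
    assert(n >= sizeof(Message));
    Message m;
    unsigned char *dst = reinterpret_cast<unsigned char *>(&m);
    for (std::size_t i = 0; i < sizeof m; ++i) {
        dst[i] = wire[i];
    }
    return m;
}

// If the bytes already live in an unsigned char array, copy them into
// a real Message rather than reinterpreting the array's address.
Message copy_out(const unsigned char *wire, std::size_t n) {
    assert(n >= sizeof(Message));
    Message m;
    std::memcpy(&m, wire, sizeof m);
    return m;
}
```

Both functions produce the same `Message` for the same input bytes. What the comment warns against is a third variant, `*reinterpret_cast<const Message *>(wire)`, which reads a `Message` where only an `unsigned char` array was ever declared.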

The authors of the C89 Standard wanted to allow e.g.

int thing;
unsigned char *p = (unsigned char *)&thing;
int i;
for (i=0; i<sizeof thing; i++)
  p[i] = getbyte();

and

int thing = somevalue();
unsigned char *p = (unsigned char *)&thing;
int i;
for (i=0; i<sizeof thing; i++)
  putbyte(p[i]);

but not to require that compilers handle any possible aliasing given something like:

/* global definitions */
int thing;
double *p;

int x(double *p)
{
  thing = 1;
  *p = 1.0;
  return thing;
}

There are two ways in which the supported and unsupported cases differ: (1) in the cases to be supported, the access is made through a character-type pointer rather than a pointer of some other type, and (2) after the address of the object in question is converted to another type, all accesses to the storage using that pointer are made before the next access using the original lvalue. The authors of the Standard unfortunately regarded only the first as significant, even though the second would have been a much more reliable way of identifying cases where aliasing may be important. If the Standard had focused on the second, it might not have required compilers to recognize aliasing in your example. As it is, though, the Standard requires that compilers recognize aliasing any time programs use character types, despite the needless impact on the performance of code that is processing actual character data.
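
As a rough illustration of what criterion (1) means for a compiler, compare two versions of the function above, written here in C++ to match the question. The names `store_via_bytes` and `store_via_double` are hypothetical. Whether a particular compiler actually reloads or caches `thing` depends on the compiler and optimization level, so the comments describe what is permitted, not what any given compiler does.

```cpp
int thing;

// Character-type pointer: the store through p may legally modify
// `thing`, so the return value has to reflect that store.
int store_via_bytes(unsigned char *p) {
    thing = 1;
    *p = 2;
    return thing;  // must be read again after the store through p
}

// Non-character pointer: a double lvalue may not access an int, so
// the compiler is allowed to assume p does not point at `thing` and
// to return the constant 1 without reloading.
int store_via_double(double *p) {
    thing = 1;
    *p = 1.0;
    return thing;
}
```

Calling `store_via_bytes(reinterpret_cast<unsigned char *>(&thing))` is well defined and returns the modified value. Passing an address derived from `&thing` to `store_via_double` would be undefined behavior, so it should only be called with a pointer to a real `double`.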

Rather than fixing this fundamental mistake, later standards for both C and C++ have simply kept the same broken approach.

supercat