
Suppose I want to dynamically allocate space for an int and write the maximum representable value into that memory. This code comes to mind:

auto rawMem = std::malloc(sizeof(int));         // rawMem's type is void*
*(reinterpret_cast<int*>(rawMem)) = INT_MAX;    // INT_MAX from <limits.h>

Does this code violate C++'s strict aliasing rules? Neither g++ nor clang++ complains with -Wall -pedantic.

If the code doesn't violate strict aliasing, why not? std::malloc returns void*, so while I don't know what the static and dynamic types of the memory returned by std::malloc are, there's no reason to think either is int. And we're not accessing the memory as a char or unsigned char.

I'd like to think the code is kosher, but if it is, I'd like to know why.

As long as I'm in the neighborhood, I'd also like to know the static and dynamic types of the memory returned by the memory allocation functions (std::malloc and std::operator new).

KnowItAllWannabe
  • If it were illegal, it would defeat the purpose of `malloc` – bolov Jun 19 '16 at 05:08
  • You can use `memcpy` to rigorously adhere to the strict aliasing rules, so even if the code I'm showing isn't legal, there is a legal way to achieve the same thing. The question is whether you have to use a `char`-based mechanism like `memcpy`. I'm hoping not, but the Standard seems to say that my code isn't valid. – KnowItAllWannabe Jun 19 '16 at 05:37
  • `new` calls `malloc`. It's safe. – bolov Jun 19 '16 at 05:45
  • If it's safe, that's great, but I'd like to have some reference to the strict aliasing rules in the Standard that explain why it's safe. – KnowItAllWannabe Jun 19 '16 at 06:11
  • At the point of assignment you have two pointers pointing to this object in your program - one of type `void*`, and another of type `int*`. – milleniumbug Jun 19 '16 at 18:45

1 Answer


The strict aliasing rule allows the compiler to assume that the same memory location is not accessed through pointers to different, unrelated types (char and unsigned char being the notable exceptions).

Consider the following code:

int* pi = ...;
double* pd = ...;

const int i1 = *pi;    // (1)
*pd = 123.456;         // (2)
const int i2 = *pi;    // (3)

Analyzing this code with the strict aliasing rule in mind suggests that i2 == i1, since the location pointed to by pi cannot legally be modified through pd between (1) and (3). The compiler can therefore eliminate one of the variables i1 or i2 (provided that the program doesn't take the address of either of them). In general, the strict aliasing rule gives the compiler more freedom when optimizing the code.
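To make the fragment above compilable, here is a minimal sketch that wraps it in a function. The name reads_agree is hypothetical and only for illustration. When called with pointers to distinct objects, as the rule requires, the two reads of *pi necessarily agree, which is exactly what the optimizer is allowed to assume.

```cpp
#include <cassert>

// Hypothetical helper (not from the original post): reads *pi, writes
// through pd, then reads *pi again. Since int and double are unrelated
// types, the compiler may assume the write at (2) cannot change *pi and
// may reuse the value loaded at (1) instead of reloading at (3).
bool reads_agree(int* pi, double* pd) {
    assert(pi != nullptr && pd != nullptr);  // precondition of this sketch

    const int i1 = *pi;   // (1)
    *pd = 123.456;        // (2)
    const int i2 = *pi;   // (3)
    return i1 == i2;
}
```

Calling it as reads_agree(&i, &d) with a separate int i and double d returns true. Passing a double* that was cast from &i would break the rule, and the result would then be undefined rather than merely false.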

In your example you obtain a memory location through malloc(). The compiler doesn't assume any type for that location: both its static and dynamic type are, in effect, untyped raw memory. (Because of the special status of char in the strict aliasing rule, we can also legally treat that location as an array of chars.) The strict aliasing rule doesn't yet apply to the new memory location, for the simple reason that no typed pointer exists that could be involved in the analysis.

It is you who assigns a type to that location, by initializing it with an object of the desired type. For a primitive or POD type, a reinterpret_cast followed by an assignment (just as in your example) is a valid way to initialize the location; for a type with a non-trivial constructor you need to construct the object with placement new. From that moment on, the location stops being raw memory and is subject to the strict aliasing rule.
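As a rough sketch of the three routes mentioned in this thread (plain assignment for a trivial type, the memcpy route from the comments, and placement new for a non-trivial type), something like the following should work. All function names are made up for illustration, and the first function relies on the reasoning above. Since C++20, std::malloc is also specified to implicitly create objects of implicit-lifetime types such as int, which puts that case on firmer ground.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>
#include <string>

// Trivial type: the raw storage gets its type when an int is written to it.
int store_int_max() {
    void* rawMem = std::malloc(sizeof(int));
    if (rawMem == nullptr) throw std::bad_alloc();
    int* p = static_cast<int*>(rawMem);  // static_cast suffices for void*
    *p = INT_MAX;
    const int result = *p;
    std::free(rawMem);
    return result;
}

// The memcpy route: never touches the storage through an int lvalue.
int store_int_max_memcpy() {
    void* rawMem = std::malloc(sizeof(int));
    if (rawMem == nullptr) throw std::bad_alloc();
    const int value = INT_MAX;
    std::memcpy(rawMem, &value, sizeof value);
    int result = 0;
    std::memcpy(&result, rawMem, sizeof result);
    std::free(rawMem);
    return result;
}

// Non-trivial type: the object must be constructed with placement new
// and destroyed explicitly before the storage is released.
std::size_t construct_string_length() {
    using String = std::string;
    void* rawMem = std::malloc(sizeof(String));
    if (rawMem == nullptr) throw std::bad_alloc();
    String* s = new (rawMem) String("raw memory");
    assert(static_cast<void*>(s) == rawMem);  // constructed in place
    const std::size_t length = s->size();
    s->~String();      // explicit destructor call
    std::free(rawMem);
    return length;
}
```

The first two functions return INT_MAX, and the third returns the length of the constructed string. Skipping placement new in the third case and assigning through a casted pointer would invoke std::string's assignment operator on an object that was never constructed.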

Leon
  • I understand the concept behind strict aliasing, but what matters to me in this question is what the standard says. My concern is that it says that the only type through which raw memory (e.g., returned from `malloc`) can be accessed is `char` or `unsigned char`. I'm hoping I'm mistaken. – KnowItAllWannabe Jun 19 '16 at 14:21
  • @KnowItAllWannabe I tried to improve the answer so that it better addresses your question. – Leon Jun 19 '16 at 19:14
  • I think you've hit on a key point: that strict aliasing doesn't apply to untyped memory. 3.10/10 (the strict aliasing rule) applies only to accesses of "the stored value of an object," and 1.8/1 says that an object is a region of storage with a type. `malloc` returns a pointer to a region of storage, but not a pointer to an object. Hence access to the storage returned by `malloc` isn't an object access until it has a type, and in my original post, the type was applied through `reinterpret_cast`. – KnowItAllWannabe Jun 19 '16 at 22:19
  • @KnowItAllWannabe: The rules in question are very badly written. Most likely what was intended was to say that while a range of memory is being viewed as an object of a certain type, accesses need to either (1) be done directly on that object, (2) be done using a pointer of certain types, or (3) be done by taking the address of the object or part thereof, or casting a pointer to the object, and do all accesses with the resulting pointer before the next access to the original object (the fact that compilers should recognize aliasing in the latter case was probably seen at the time... – supercat Jun 23 '16 at 18:48
  • ...and for many years afterward as being too obvious to be worth mentioning, but lately it has become fashionable for aggressive compilers to pretend that aliasing can't occur even in cases where code is clearly shouting that it can. – supercat Jun 23 '16 at 18:50