Consider this artificial example:

#include <stddef.h>

static inline void nullify(void **ptr) {
    *ptr = NULL;
}

int main() {
    int i;
    int *p = &i;
    nullify((void **) &p);
    return 0;
}

&p (an int **) is cast to void **, and the result is then dereferenced. Does this break the strict aliasing rules?

According to the standard:

An object shall have its stored value accessed only by an lvalue expression that has one of the following types:

  • a type compatible with the effective type of the object,

So unless void * is considered compatible with int *, this violates the strict aliasing rules.
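One way to probe the compatibility question is C11's `_Generic`, which selects an association only when its type is compatible with the type of the controlling expression. The following is a minimal sketch, assuming a C11 compiler; the function name is made up for illustration.

```c
#include <assert.h>

/* Hypothetical probe: yields 1 if int * is compatible with void *, else 0.
 * _Generic selects an association only when its type is compatible with
 * the type of the controlling expression; otherwise "default" is used. */
static int int_ptr_is_compatible_with_void_ptr(void) {
    return _Generic((int *)0, void *: 1, default: 0);
}

/* The same checks, evaluated at compile time. */
static_assert(_Generic((int *)0, int *: 1, default: 0) == 1,
              "int * is compatible with itself");
static_assert(_Generic((int *)0, void *: 1, default: 0) == 0,
              "int * is not compatible with void *");
```

The probe yields 0: int * converts implicitly to void *, but the two are not compatible types, so the bullet quoted above does not cover the access.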

However, gcc's warnings suggest otherwise (although that proves nothing).

While compiling this sample:

#include <stddef.h>
void f(int *p) {
    *((float **) &p) = NULL;
}

gcc warns about strict aliasing:

$ gcc -c -Wstrict-aliasing -fstrict-aliasing a.c
a.c: In function ‘f’:
a.c:3:7: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
      *((float **) &p) = NULL;
      ~^~~~~~~~~~~~~

However, with a void **, it does not warn:

#include <stddef.h>
void f(int *p) {
    *((void **) &p) = NULL;
}

So is it valid regarding the strict aliasing rules?

If it is not, how can one write a function that nullifies any pointer (for example) without breaking the strict aliasing rules?

rom1v
  • Some pointers here: https://stackoverflow.com/questions/16160799/incompatible-pointer-type-in-c, https://stackoverflow.com/questions/25161649/explicit-cast-required-to-pointer-to-void-pointer – Ilja Everilä Sep 01 '18 at 14:17
  • gcc [strict aliasing checks can have false positives and false negatives](https://stackoverflow.com/a/25118277/1708801) – Shafik Yaghmour Sep 01 '18 at 16:21
  • If code needs any semantics beyond what the authors of gcc/clang feel like supporting, the only reliable way to ensure generation of machine code that upholds such semantics is to use `-fno-strict-aliasing`. The fact that tweaking one's code a certain way will allow it to compile without warnings says nothing about whether the generated code will actually work. To make matters worse, gcc and clang will sometimes optimize out code which they recognize as being incapable of changing the stored pattern of bits in a region of storage, even if it would change the effective type of that storage. – supercat Sep 04 '18 at 16:32

1 Answer

There is no general requirement that implementations use the same representation for different pointer types. On a platform that used different representations for, e.g., int* and char*, there would be no way to support a single pointer type void** that could act upon both int* and char* objects interchangeably. Although an implementation that can handle pointers interchangeably would facilitate low-level programming on platforms which use compatible representations, such an ability would not be supportable on all platforms. Consequently, the authors of the Standard had no reason to mandate support for such a feature rather than treating it as a quality-of-implementation issue.
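Where representations may differ, the conversion can be made explicit by going through a real void* object at the call site. The following is a minimal sketch of that pattern, not something the answer itself proposes; demo_nullify is a made-up name, and nullify mirrors the function from the question.

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as the question's nullify(): writes through a void **. */
static void nullify(void **ptr) {
    *ptr = NULL;
}

/* Hypothetical demo: round-trip the pointer through a void * temporary. */
static void demo_nullify(void) {
    int i = 0;
    int *p = &i;

    void *tmp = p;   /* int * -> void *: implicit conversion */
    nullify(&tmp);   /* &tmp really is a void **, so nothing is type-punned */
    p = tmp;         /* void * -> int *: implicit conversion back */

    assert(p == NULL);
}
```

The cost is that the caller has to copy the value back by hand, so the helper can no longer update an arbitrary pointer in place.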

From what I can tell, quality compilers like icc which are suitable for low-level programming, and which target platforms where all pointers have the same representation, will have no difficulty with constructs like:

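#include <stdlib.h>  /* realloc, size_t */

/* fatal_error() is assumed to report the message and not return. */
void resizeOrFail(void **p, size_t newsize)
{
  void *newAddr = realloc(*p, newsize);
  if (!newAddr) fatal_error("Failure to resize");
  *p = newAddr;
}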

anyType *thing;

... code chunk #1 that uses thing
   resizeOrFail((void**)&thing, someDesiredSize);
... code chunk #2 that uses thing

Note that in this example, both the act of taking thing's address and all uses of the resulting pointer visibly occur between the two chunks of code that use thing. Thus, there is no actual aliasing, and any compiler which is not willfully blind will have no trouble recognizing that passing thing's address to resizeOrFail might cause thing to be modified.

On the other hand, if the usage had been something like:

void **myptr;    
anyType *thing;

myptr = (void **)&thing;
... code chunk #1 that uses thing
*myptr = realloc(*myptr, newSize);
... code chunk #2 that uses thing

then even quality compilers might not realize that thing might be affected between the two chunks of code that use it, since there is no reference to anything of type anyType* between those two chunks. On such compilers, it would be necessary to write the code as something like:

myptr = (void **)&thing;
... code chunk #1 that uses thing
*(void *volatile*)myptr = realloc(*myptr, newSize);
... code chunk #2 that uses thing

to let the compiler know that the operation on *myptr is doing something "weird". Quality compilers intended for low-level programming will regard this as a sign that they should avoid caching the value of thing across such an operation, but even the volatile qualifier won't be enough for implementations like gcc and clang in optimization modes that are only intended to be suitable for purposes that don't involve low-level programming.

If a function like resizeOrFail needs to work with compiler modes that aren't really suitable for low-level programming, it could be written as:

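#include <stdlib.h>  /* realloc */
#include <string.h>  /* memcpy */

void resizeOrFail(void **p, size_t newsize)
{
  void *newAddr;
  memcpy(&newAddr, p, sizeof newAddr);  /* read the pointer's bytes, not a void* lvalue */
  newAddr = realloc(newAddr, newsize);
  if (!newAddr) fatal_error("Failure to resize");
  memcpy(p, &newAddr, sizeof newAddr);  /* write the new address back bytewise */
}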

This would, however, require that compilers allow for the possibility that resizeOrFail might alter the value of an arbitrary object of any type (not merely data pointers), and would thus needlessly impair what should be useful optimizations. Worse, if the pointer in question happens to be stored on the heap (and isn't of type void*), a conforming compiler that isn't suitable for low-level programming would still be allowed to assume that the second memcpy can't possibly affect it.
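For concreteness, here is a sketch of how the memcpy-based version might be called. It assumes a platform where int* and void* have the same size and representation, which is the very assumption under discussion. The body of fatal_error and the demo_resize function are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the answer's fatal_error(); its real behavior is not shown. */
static void fatal_error(const char *msg) {
    fprintf(stderr, "%s\n", msg);
    exit(EXIT_FAILURE);
}

/* The memcpy-based version from above. */
static void resizeOrFail(void **p, size_t newsize) {
    void *newAddr;
    memcpy(&newAddr, p, sizeof newAddr);
    newAddr = realloc(newAddr, newsize);
    if (!newAddr) fatal_error("Failure to resize");
    memcpy(p, &newAddr, sizeof newAddr);
}

/* Hypothetical demo. ASSUMPTION: int * and void * share one size and
 * representation; the Standard does not guarantee this. */
static void demo_resize(void) {
    int *thing = malloc(4 * sizeof *thing);
    if (!thing) fatal_error("Failure to allocate");
    for (int i = 0; i < 4; i++) thing[i] = i;

    /* thing is a local object with declared type int *, not heap storage. */
    resizeOrFail((void **)&thing, 8 * sizeof *thing);

    /* realloc preserves the old contents up to the smaller of the sizes. */
    for (int i = 0; i < 4; i++) assert(thing[i] == i);
    free(thing);
}
```

Here thing is a local variable, so the heap-storage caveat described above does not arise; a pointer living inside an allocated block would be subject to it.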

A key part of low-level programming is ensuring that one chooses implementations and modes that are suitable for that purpose, and knowing when they might need a volatile qualifier to help them out. Some compiler vendors might claim that any code which requires that compilers be suitable for its purposes is "broken", but attempting to appease such vendors will result in code that is less efficient than could be produced by using a quality compiler suitable for one's purposes.

supercat
  • Thank you for your very detailed answer! IIUC, if I don't target a specific compiler, but want the implementation to be "standard-compliant", there is no solution (even with memcpy())? – rom1v Sep 02 '18 at 10:31
  • @rom1v: To be "standard compliant", one would need to perform the copy in a fashion that cannot possibly be described as being copied as a character array. The Standard doesn't say exactly what that means, but if gcc recognizes that a piece of code behaves in a fashion equivalent to memcpy, the same semantics end up applying. IIRC, given `uint16_t *src,*dest`, gcc sometimes optimized things like `int temp1 = ((char*)src)[0] + 1; int temp2 = ((char*)src)[1] + 1; ((char*)dest)[0] = temp1 - 1; ((char*)dest)[1] = temp2 - 1;`, into `*dest = *src;`, even when they weren't equivalent. – supercat Sep 02 '18 at 14:32
  • @rom1v: The Standard's tolerance for such things would be justifiable if and only if it made clear that they are *concessions for low-quality implementations*, and that any implementer who insisted upon the "right" to behave in such fashion would be admitting that they lack the skill to produce a quality implementation. – supercat Sep 02 '18 at 14:39