
Suppose A is a struct and I have a function to allocate memory

f(size_t s, void **x)

I call f to allocate memory as follows.

struct A* p;
f(sizeof(struct A), (void**)&p);

I wonder whether (void**)&p here is a well-defined cast. I know that in C it is well defined to convert a pointer to void* and back. However, I am not sure about the case of void**. I found a document stating that we should not cast to a pointer type with a stricter alignment requirement. Does void** have a stricter or looser alignment requirement?

1 Answer


The conversion is not defined by the C standard, and, even if it were, code in f that assigned to it via the void ** type would not be defined by the C standard.

C 2018 6.3.2.3 7 says a pointer to an object type may be converted to a pointer to a different object type. This covers (void **) &p, since &p is a pointer to the object p, and void ** is a pointer to the object type void *. However, this paragraph only tells us the conversion may be performed. It does not fully define what the result is. It says:

  • “If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined.” This is generally not a problem; in common C implementations, the alignment requirements of void * and struct A * will be the same, and this is easily checked.

  • “Otherwise, when converted back again, the result shall compare equal to the original pointer.” This is all the paragraph tells us about the result of the conversion: It is a pointer you can convert back to struct A * to get the original pointer or its equivalent. It does not tell us the pointer can be used for anything else while it is in the void ** type.

  • “When a pointer to an object is converted to a pointer to a character type,…” This part of the paragraph does not apply, since we are not converting to a pointer to a character type.
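The alignment claim in the first bullet can be checked at compile time. The following is a minimal sketch, assuming C11 or later; the body of `struct A` and the helper name `pointer_alignments_match` are placeholders for illustration and do not come from the question.

```c
#include <assert.h>    /* static_assert (C11) */
#include <stdalign.h>  /* alignof (C11) */

struct A { int value; };  /* placeholder definition; the question does not show one */

/* Compilation fails on an implementation where the two alignments differ. */
static_assert(alignof(void *) == alignof(struct A *),
              "void * and struct A * have different alignment requirements");

/* The same comparison as a run-time value: 1 if equal, 0 otherwise. */
int pointer_alignments_match(void)
{
    return alignof(void *) == alignof(struct A *);
}
```

Passing this check only addresses the alignment bullet. It says nothing about whether the converted pointer may be dereferenced.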

So, suppose the function f has some code that uses its parameter x like this:

*x = malloc(…);

Because the standard did not define what will happen if x is used as a void ** for any purpose other than converting it back to struct A *, we do not know what *x will do.

A typical expectation is that *x will access the same memory p is in, but it will access it as a void * instead of as a struct A *. A technical problem here is that the C standard does not guarantee that a void * is represented in memory in the same way that a struct A * is represented in memory. As far as the standard is concerned, void * could use eight bytes while struct A * uses four bytes, or void * could use a flat byte address while struct A * uses a segment-and-offset address scheme. However, as with alignment, in common C implementations, different types of pointers have the same representation in memory, and this can be checked.
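One way to probe the representation on a given implementation is to compare sizes at compile time and compare the stored bytes for a sample address. This is only a sketch under assumptions: the body of `struct A` and the name `same_pointer_bytes` are hypothetical, and a spot check on one address is evidence, not proof. The implementation's documentation is the authoritative source.

```c
#include <assert.h>  /* static_assert (C11) */
#include <string.h>  /* memcmp */

struct A { int value; };  /* placeholder definition; the question does not show one */

/* Compilation fails if the two pointer types differ in size. */
static_assert(sizeof(void *) == sizeof(struct A *),
              "void * and struct A * have different sizes");

/* Spot check for one address: returns 1 if the bytes stored in a
   struct A * and in the void * converted from it are identical. */
int same_pointer_bytes(struct A *sp)
{
    void *vp = sp;  /* implicit conversion to void * is always defined */
    return memcmp(&vp, &sp, sizeof vp) == 0;
}
```

Inspecting the bytes of a pointer object with memcmp is allowed, because any object may be accessed through a character type.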

But then we arrive at the aliasing rule. Even if void * and struct A * have the same representation in memory, C 2018 6.5 7 says:

An object shall have its stored value accessed only by an lvalue expression that has one of the following types:

— a type compatible with the effective type of the object,

The list continues with several other categories of types, and none of them match the struct A * type of p. That is, this paragraph in the standard tells us the object p shall have its stored value accessed (“accessed” in the C standard includes both reading and writing) only by an expression that has one of the listed types. The expression used to access p in *x = malloc(…); is *x, and its type is void *, and void * is not compatible with struct A *, and void * is also not any of the other types listed in the paragraph.

So the code *x = malloc(…); breaks that rule. Violating a “shall” rule means the behavior of the code is not defined by the C standard.

Some compilers can be told to permit this kind of aliasing through a command-line switch (for example, -fno-strict-aliasing in GCC and Clang). Using such a switch prevents some optimizations. In particular, given two pointers x and y that point to different types not matching the aliasing rule, the compiler may otherwise assume they point to different objects, so it can reorder accesses to *x and *y in whatever way is efficient, because a store to one cannot change the value in the other.
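To make the optimization concrete, here is a small sketch of a function where the assumption matters. The function name `store_then_compare` and the body of `struct A` are hypothetical. Whether a particular compiler actually performs this transformation depends on the compiler and its options.

```c
#include <assert.h>
#include <stddef.h>

struct A { int value; };  /* placeholder definition */

/* Stores through two pointers of types that the aliasing rule treats as
   unrelated, then re-reads the first one. */
int store_then_compare(struct A **y, void **x, struct A *a)
{
    assert(y != NULL && x != NULL);
    *y = a;     /* store through an lvalue of type struct A * */
    *x = NULL;  /* store through an lvalue of type void * */
    /* Under the aliasing rule the compiler may assume the store to *x did
       not change *y, so it is allowed to return 1 without reloading *y. */
    return *y == a;
}
```

Called with pointers to two distinct objects, this has defined behavior and returns 1. Called as store_then_compare(&p, (void **)&p, a), it has the undefined behavior described above, and an optimizing compiler may still return 1 even though p now holds a null pointer.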

So, if you verify that void * and struct A * have the same representation and alignment requirement and that your compiler supports aliasing, then the behavior will be defined for the specific C implementation you check. However, it is not defined by the C standard generally.
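If f cannot be changed, one way to call it without relying on any of this is to pass the address of an actual void * object and convert the result afterwards. This is a sketch, not necessarily how the original code should be structured: the return type and body of f, the body of `struct A`, and the wrapper name `alloc_A` are all assumptions.

```c
#include <assert.h>
#include <stdlib.h>

struct A { int value; };  /* placeholder definition */

/* Assumed shape of f: stores the newly allocated block through x. */
void f(size_t s, void **x)
{
    assert(x != NULL);
    *x = malloc(s);
}

/* Hypothetical wrapper: no pointer-to-pointer cast is involved. */
struct A *alloc_A(void)
{
    void *tmp = NULL;
    f(sizeof(struct A), &tmp);  /* &tmp really has type void ** */
    return tmp;                 /* void * converts to struct A * implicitly */
}
```

Here *x inside f accesses an object whose type really is void *, so the aliasing rule is satisfied, and the conversion from void * to struct A * is one the standard fully defines.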

Eric Postpischil
  • As a side question, will `(struct A*)(*x) = malloc(…);` make it valid? – Afshin Jul 13 '22 at 10:04
  • @Afshin No, that will not even compile. – Ian Abbott Jul 13 '22 at 10:07
  • @IanAbbott oops, my mistake. – Afshin Jul 13 '22 at 10:09
  • @Afshin: `* (struct A **) x = malloc(…);` would be fine as long as the alignment requirements (`void *` has at least the same alignment requirement as `struct A *`) are met, which they are in common C implementations. – Eric Postpischil Jul 13 '22 at 10:31
  • @EricPostpischil Why would the alignment of `void *` come into play? You aren't writing any `void*` here. – Goswin von Brederlow Jul 13 '22 at 16:46
  • @GoswinvonBrederlow: Suppose the alignment requirement of `struct A *` is two bytes and `p` is at address 102. Further suppose the alignment requirement of `void *` is four bytes and the compiler represents `void **` as a number of four-byte words, so 13 in the bytes of a `void **` means address 52. Then `(void **) &p` will convert the address of `p`, 102, to `void **` by dividing 102 by four, yielding 25, with the remainder discarded. When this is later converted to `struct A **`, it is multiplied by four, yielding 100, which is wrong; it is not the address of `p`, 102. – Eric Postpischil Jul 13 '22 at 17:44
  • Suppose we have two structs `A` and `B` with the same alignment requirement. And we have a variable `a` of type `A`. Does `struct B *pb = (struct B *)&a` have the representation problem mentioned above please? – user18676624 Jul 14 '22 at 01:58
  • @user18676624: Pointers to structures are special in that the C standard requires all pointers to structure types to have the same representation as each other. – Eric Postpischil Jul 14 '22 at 02:19
  • @EricPostpischil That isn't a problem of the alignment but representation. The alignment difference would seem to allow using different representations though. But have you ever seen such a C implementation? Seems like a problem only relevant to the language.lawyer tag. – Goswin von Brederlow Jul 14 '22 at 21:48
  • @GoswinvonBrederlow: That is a problem of alignment; there is a violation of the rule about alignment during conversion. The reason for that rule is representation: A pointer to a type with an alignment requirement of X is only required to represent addresses that are multiples of X. Therefore, when another address is converted to that type, if it is not aligned correctly, the type might not be able to represent it. Therefore the conversion rule in the C standard must have a limitation on the alignment. I know of C implementations with different pointer representations for different types. – Eric Postpischil Jul 14 '22 at 22:17
  • @EricPostpischil Why are you arguing when I'm agreeing with you? – Goswin von Brederlow Jul 14 '22 at 22:20
  • The question was about `struct A**` vs `void**`. `struct A*` can be cast to `void*` and back, as can `char` and `struct { char c; }` pointers. So I would argue `void*` must use the same representation as a `struct` pointer with byte granularity. So `struct A**` and `void**` would have the same size and alignment. Why would their representation ever differ? Note: `struct A**` might very well only store multiples of 4 but then `void**` would for the same reasons. It doesn't make sense to save bits in one but not the other. – Goswin von Brederlow Jul 14 '22 at 22:39
  • `void *` is required to be able to represent any address. A pointer to a `struct` is not required to be able to represent any address. Therefore the demands on `void *` and `struct foo *` are different, so they could have different representations. In a word-oriented C implementation, `struct foo *` might consist of only a word number, whereas `void *` would have extra bits for the byte-within-the-word. And since `struct foo *` and `void *` would be different, the implementors might make different choices about the properties of `struct foo **` and `void **`. – Eric Postpischil Jul 14 '22 at 23:12
  • Sorry, I have another question. If I cast `struct foo*` to `void *`, is it well-defined to use the resulting pointer of `void *` type for something other than converting it back to `struct foo*`? I find that the parameter of the `free` function is of type `void *`, and thus it seems that the answer is yes. – user18676624 Jul 15 '22 at 02:24
  • @user18676624: A pointer that originated as a `struct foo *` can be converted to a pointer to the type of the first member of the structure and used to access that member. It can also be converted to a pointer to a character type and used to access the bytes that represent the structure. – Eric Postpischil Jul 15 '22 at 15:03
  • @EricPostpischil Thanks a lot. Could you please show me the part of the standard that covers this point? – user18676624 Jul 18 '22 at 01:22
  • @user18676624: C 2018 6.7.2.1 15 says “… A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa…” 6.3.2.3 7 says “… When a pointer to an object is converted to a pointer to a character type, the result points to the lowest addressed byte of the object. Successive increments of the result, up to the size of the object, yield pointers to the remaining bytes of the object…,” and accessing those bytes is defined by 6.5 7. – Eric Postpischil Jul 19 '22 at 22:17