
In zwol's answer to "Is it legal to implement inheritance in C by casting pointers between one struct that is a subset of another rather than first member?", he gives an example of why a simple typecast between similar structs isn't safe, and the comments include a sample environment in which it behaves unexpectedly: compiling the following with gcc at -O2 causes it to print "x=1.000000 some=2.000000".

#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

struct base
{
    double some;
    char space_for_subclasses[];
};
struct derived
{
    double some;
    int value;
};

double test(struct base *a, struct derived *b)
{
    a->some = 1.0;
    b->some = 2.0;
    return a->some;
}

int main(void)
{
    size_t bufsz = sizeof(struct base);
    if (bufsz < sizeof(struct derived)) bufsz = sizeof(struct derived);
    void *block = malloc(bufsz);

    double x = test(block, block);
    printf("x=%f some=%f\n", x, *(double *)block);
    return 0;
}

I was fooling around with the code to better understand exactly how it behaves because I need to do something similar, and noticed that marking a as volatile was enough to prevent it from printing different values. This lines up with my expectations as to what is going wrong - gcc is assuming that a->some is unaffected by the write to b->some. However, I would have thought gcc could only assume this if a or b were marked with restrict.

Am I misunderstanding what is happening here and/or the meaning of the restrict qualifier? If not, is gcc free to make this assumption because a and b are of different types? Finally, does marking both a and b as volatile make this code compliant with the standard, or at least prevent the undefined behaviour from allowing gcc to make the aforementioned assumption?
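
For contrast, here is a minimal sketch of the same function with both parameters declared as the same type. The name test_same_type and the trimmed-down struct base (without the flexible array member) are assumptions made for illustration; they are not taken from the linked answer. With identical pointer types and no restrict, the compiler has to allow for a and b pointing at the same object, so the final read must observe the second store.

```c
/* Simplified stand-in for the struct base above (illustrative only). */
struct base
{
    double some;
};

/* Both parameters have the same type and neither is restrict-qualified,
   so the compiler must assume a and b may refer to the same object. */
double test_same_type(struct base *a, struct base *b)
{
    a->some = 1.0;
    b->some = 2.0;
    return a->some;   /* 2.0 when a == b, 1.0 when they are distinct */
}
```

Calling test_same_type(&s, &s) on a single struct base s is well defined and returns 2.0 at any optimization level, which suggests that the surprising result in the original program comes from the differing parameter types rather than from a missing restrict.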

Daniel McIntosh
  • Also see [What is the strict aliasing rule](https://stackoverflow.com/a/51228315/1708801) – Shafik Yaghmour Dec 10 '18 at 03:52
  • @ShafikYaghmour so is this type of casting UB only because of the strict aliasing violation in test? That is, if no two pointers of differing type reference the same block of memory in the same scope, is this type of casting safe to use to implement inheritance? Also, does marking variables volatile make any difference to the strict aliasing violation? – Daniel McIntosh Dec 10 '18 at 04:25
  • Using `volatile` is not guaranteed to make it work. – Jonathan Leffler Dec 10 '18 at 04:47
  • The problem here is that on the specific compiler you used, the undefined behavior only manifested itself when you had optimizations enabled. And then `volatile` seemed to "fix" it. As for `restrict`, it is only there to enable further optimizations of already valid code - it cannot be used to somehow make a certain illegal kind of type punning valid. – Lundin Dec 10 '18 at 12:09
  • @JonathanLeffler How can a compiler not make it work with volatile? – curiousguy Dec 11 '18 at 21:13

2 Answers


If a region of storage is accessed exclusively using volatile-qualified lvalues, a compiler would have to go extremely far out of its way not to process every write as translating the values written to a pattern of bits and storing it, and every read as reading a bit pattern from memory and translating it into a value. The Standard does not actually mandate such behavior, and in theory a compiler given:

long long volatile foo;
...
int test(void)
{
  return *((short volatile*)(&foo));
}

could assume that any code branch that could call test will never be executed, but I don't yet know of any compilers that behave in such extreme fashion.

On the other hand, given a function like the following:

void zero_aligned_pair_of_shorts(uint16_t *p)
{
  /* Deliberately stores to both uint16_t objects through a volatile uint32_t lvalue */
  *((uint32_t volatile*)p) = 0;
}

compilers like gcc and clang will not reliably recognize that it might have some effect upon the stored value of an object which is accessed using an unqualified lvalue of type uint16_t. Some compilers like icc regard volatile accesses as an indicator to synchronize any register-cached objects whose address has been taken, because doing so is a cheap and easy way for compilers to uphold the Spirit of C principle described in the Standards' charter and rationale documents as "Don't prevent the programmer from doing what needs to be done" without requiring special syntax. Other compilers like gcc and clang, however, require that programmers either use gcc/clang-specific intrinsics or else use command-line options to globally block most forms of register caching.
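
As a point of comparison, here is a minimal sketch of a portable way to get the same effect as zero_aligned_pair_of_shorts() without accessing the uint16_t objects through an lvalue of another type. The name zero_pair_of_shorts is hypothetical, and this is one common idiom rather than anything the compilers above require: memcpy copies the bytes as characters, which every conforming compiler has to treat as possibly modifying the destination objects.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical counterpart to zero_aligned_pair_of_shorts(): clears two
   adjacent uint16_t objects by copying four zero bytes over them. */
void zero_pair_of_shorts(uint16_t *p)
{
    const uint32_t zero = 0;

    /* Byte-wise copy: no uint16_t object is written through a uint32_t lvalue. */
    memcpy(p, &zero, sizeof zero);
}
```

This needs no alignment assumption and no volatile qualifier, and optimizing compilers commonly turn the fixed-size memcpy into a single store.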

supercat

The problem with this particular question and zwol's answer is that they conflate type punning and strict aliasing. Zwol's answer is correct for that particular use case, because of the type used to initialize the structure; but not in the general case, nor with respect to the POSIX struct sockaddr types, as one might read the answer to imply.

For type punning between structure types with common initial members, all you need to do is to declare (not use!) a union of those structures, and you can safely access the common members through a pointer of any of the structure types. This has been explicitly allowed behaviour since ANSI C 3.3.2.3, including C11 6.5.2.3p6 (link to n1570 draft).

If an implementation contains a union of all struct sockaddr_ structures visible to userspace applications, then the answer by zwol that the OP links to is, in my opinion, misleading if one reads it to imply that supporting struct sockaddr structures requires something nonstandard from compilers. (If you define _GNU_SOURCE, glibc defines such a union as struct __SOCKADDR_ARG containing an anonymous union of all such types. However, glibc is designed to be compiled using GCC, so it could have other issues.)

Strict aliasing is a requirement that the parameters to a function do not refer to the same storage (memory). As an example, if you have

int   i = 0;
char *iptr = (char *)(&i);

int modify(int *iptr, char *cptr)
{
    *cptr = 1;
    return *iptr;
}

then calling modify(&i, iptr) is a strict aliasing violation. The type punning in the definition of iptr is incidental, and is actually allowed (because you are allowed to use the char type to examine the storage representation of any type; C11 6.2.6.1p4).
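
The character-type inspection that C11 6.2.6.1p4 permits can be sketched as follows. The helper name nonzero_bytes is hypothetical and exists only for illustration; it reads the storage of an int one unsigned char at a time, which is allowed for any object type.

```c
#include <stddef.h>

/* Counts the nonzero bytes in the object representation of *ip.
   Reading any object through unsigned char lvalues is permitted. */
size_t nonzero_bytes(const int *ip)
{
    const unsigned char *bytes = (const unsigned char *)ip;
    size_t count = 0;

    for (size_t k = 0; k < sizeof *ip; k++)
        if (bytes[k] != 0)
            count++;

    return count;
}
```

On the usual implementations where int has no padding bits, this reports no nonzero bytes for the value 0 and exactly one for the value 1, whichever end of the object that byte sits at.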

Here is a proper example of type punning, avoiding strict aliasing issues:

struct item {
    struct item *next;
    int          type;
};

struct item_int {
    struct item *next;
    int          type; /* == ITEMTYPE_INT */
    int          value;
};

struct item_double {
    struct item *next;
    int          type; /* == ITEMTYPE_DOUBLE */
    double       value;
};

struct item_string {
    struct item *next;
    int          type;    /* == ITEMTYPE_STRING */
    size_t       length;  /* Excluding the '\0' */
    char         value[]; /* Always has a terminating '\0' */
};

enum {
    ITEMTYPE_UNKNOWN = 0,
    ITEMTYPE_INT,
    ITEMTYPE_DOUBLE,
    ITEMTYPE_STRING,
};

Now, if in the same scope the following union is visible, we can type-pun between pointers to the above structure types, and access the next and type members, completely safely:

union item_types {
    struct item         any;
    struct item_int     i;
    struct item_double  d;
    struct item_string  s;
};

For the other (non-common) members, we must use the same structure type that was used to initialize the structure. That is why the type field exists.
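
A minimal sketch of how the type field can drive this in practice, following the approach described in this answer. The helpers new_item_int, new_item_double and item_as_double are hypothetical names introduced only for illustration, and the structures are repeated in trimmed form (without the string variant) so that the block stands alone.

```c
#include <stdlib.h>

enum { ITEMTYPE_UNKNOWN = 0, ITEMTYPE_INT, ITEMTYPE_DOUBLE };

struct item        { struct item *next; int type; };
struct item_int    { struct item *next; int type; int value; };
struct item_double { struct item *next; int type; double value; };

/* Declared so that it is visible wherever the structures are punned. */
union item_types {
    struct item         any;
    struct item_int     i;
    struct item_double  d;
};

/* Allocates an int item; the type tag records which structure was used. */
struct item *new_item_int(int value, struct item *next)
{
    struct item_int *it = malloc(sizeof *it);
    if (!it)
        return NULL;
    it->next = next;
    it->type = ITEMTYPE_INT;
    it->value = value;
    return (struct item *)it;
}

/* Allocates a double item in the same way. */
struct item *new_item_double(double value, struct item *next)
{
    struct item_double *it = malloc(sizeof *it);
    if (!it)
        return NULL;
    it->next = next;
    it->type = ITEMTYPE_DOUBLE;
    it->value = value;
    return (struct item *)it;
}

/* Reads the common type member first, then casts back to the structure
   type that was used to initialize the item. Returns 0.0 if unknown. */
double item_as_double(const struct item *it)
{
    if (it->type == ITEMTYPE_INT)
        return (double)((const struct item_int *)it)->value;
    if (it->type == ITEMTYPE_DOUBLE)
        return ((const struct item_double *)it)->value;
    return 0.0;
}
```

The non-common value member is only ever reached through the structure type named by the tag, so each item is read back with the same type it was written with.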

As an example of such a completely safe usage, consider the following function that prints the values in a list of items:

void print_items(const struct item *list, FILE *out)
{
    const char *separator = NULL;

    fputs("{", out);        

    while (list) {

        if (separator)
            fputs(separator, out);
        else
            separator = ",";

        if (list->type == ITEMTYPE_INT)
            fprintf(out, " %d", ((const struct item_int *)list)->value);
        else
        if (list->type == ITEMTYPE_DOUBLE)
            fprintf(out, " %f", ((const struct item_double *)list)->value);
        else
        if (list->type == ITEMTYPE_STRING)
            fprintf(out, " \"%s\"", ((const struct item_string *)list)->value);
        else
            fprintf(out, " (invalid)");

        list = list->next;
    }

    fputs(" }\n", out);
}

Note that I used the same name value for the value field, just because I didn't think of any better one; they do not need to be the same.

The type-punning occurs in the fprintf() statements, and is valid if and only if 1) the structures were initialized using structures matching the type field, and 2) the union item_types is visible in the current scope.

None of the current C compilers I've tried have any issues with the above code, even at extreme optimization levels that break some facets of standard behaviour. (I haven't checked MSVC, but that one is really a C++ compiler, that can also compile most C code. I would be surprised, however, if it had any issues with the above code.)
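
For comparison with the "first member" approach that the linked question's title alludes to, here is a minimal sketch. The struct layout and the name set_some are assumptions made for illustration. It relies on C11 6.7.2.1p15, which says that a pointer to a structure, suitably converted, points to its initial member, so no union needs to be visible.

```c
struct base
{
    double some;
};

struct derived
{
    struct base base;   /* must be the first member */
    int         value;
};

/* Operates on the base part only; works for any structure that embeds
   struct base as its first member. */
double set_some(struct base *b, double v)
{
    b->some = v;
    return b->some;
}
```

With a struct derived d, both set_some(&d.base, 3.0) and set_some((struct base *)&d, 3.0) update d.base.some, and the function only ever touches a genuine struct base object.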

Nominal Animal
  • @Lundin: No worries. I did mention it between the two code blocks just to try and make it clear how important it is (to have it visible), and again in the next-to-last paragraph. – Nominal Animal Dec 10 '18 at 12:08
  • This answer does not answer the question at all—it never mentions `volatile`. – Eric Postpischil Dec 10 '18 at 12:56
  • I don't have time to respond in detail to this right now but I am pretty sure you're wrong. I think the distinction between "type punning" and "strict aliasing" you're trying to make is not actually present in the text of the standard, and I _know_ that it is insufficient to "declare (not use!) a union of [structures with the same common initial subsequence]", because there was a giant argument about that exact thing back in 2003 ± 2 on the gcc mailing list and the conclusion was you had to _use_ the union, anything else was intractable for the compiler. I may come back to this on Friday. – zwol Dec 10 '18 at 12:56
  • I appreciate your taking the time to write up a detailed rebuttal of my old answer, though. I hope we can come to an understanding and an ultimate answer that is better than what either of us produced originally. – zwol Dec 10 '18 at 12:58
  • @zwol: I do agree that the text of the standard is vague (and is a major reason why I deliberately nowadays avoid questions that are marked [tag:language-lawyer]). The problem is, I am definitely not going to accept GCC developers' opinion (specifically those in that time frame) as gospel as to how the standard should be interpreted. Just because it is "intractable for the compiler" does not fly, because they themselves often respond to requests for trivial changes that make specific required behavior sane as "no, because the standard says it is UB". – Nominal Animal Dec 10 '18 at 14:02
  • @zwol: That said, if you can get C standard committee reports, that's a different thing (and if you do, let me know, because I will then edit quite a few of my old answers). As always, I do trust your findings; I only disagree with the current conclusions. – Nominal Animal Dec 10 '18 at 14:04
  • @EricPostpischil: I don't address the misguided part of the question, no. I believe that answering the underlying question that vexes the asker is a better approach, because it has the larger possibility of helping others trying to find information on similar problems. As usual, I do encourage you to submit your own that does answer the question in a way you think best; in fact, I think it might help those who are only interested in getting a yes/no answer they can use in a test. I personally am not interested in helping them. – Nominal Animal Dec 10 '18 at 14:07
  • @NominalAnimal: According to published charter and rationale documents, the Standard is intended to uphold the Spirit of C, including the principle "Don't prevent the programmer from doing what needs to be done". The Standard has never made any attempt to forbid implementations from acting in ways fundamentally contrary to the Spirit of C, however. Instead, it allows implementations to uphold the Spirit of C, and treats the extent to which they do so as a Quality of Implementation issue, outside its jurisdiction. – supercat Dec 10 '18 at 18:10
  • @NominalAnimal: The Standard would in fact allow a conforming implementation to be incapable of processing any useful programs, even though that would prevent programmers from doing anything useful they might need to do. The fact that a *conforming* implementation may obstruct programmers from doing what's necessary to accomplish some purpose doesn't mean that a compiler that behaves that way could be a high-quality implementation suitable for that purpose. – supercat Dec 10 '18 at 18:13
  • @supercat: That is why I do not trust future versions of the C standard to go in a sane direction, especially when the committee is stuffed by vendor representatives. Compilers are tools, and the point of standardization is to make different makes of said tools compatible and useful. Leaving *that* as a "quality of implementation issue" is just a cop-out, to allow vendors the "Standard" badge, and leave all the problems to the end-users to find out. – Nominal Animal Dec 11 '18 at 04:57
  • @NominalAnimal: The choice of what features an implementation supports *should* be a "quality of implementation" issue, but implementations should be required to document their choices via both human-readable and "machine-readable" means, and the Standard should make clear that quality implementations that claim to be suitable for various purposes will not interfere with programmers trying to achieve those purposes. The problem is that compiler writers refuse to acknowledge the responsibility that "QoI" puts upon them. – supercat Dec 11 '18 at 14:42
  • Passing the address of one member of a union to one function, and then after that function returns passing the address of a different member to another function is a construct that nobody seeking to write a quality compiler should have any trouble supporting, but neither clang nor gcc can handle it. – supercat Dec 11 '18 at 18:51
  • "_the union item_types is visible in the current scope._" Requiring the "visibility" of type definitions that are not used is an insane idea according to a few C++ committee members, some compiler writers and many people incl. me. Also, one commonly used compiler doesn't promise to take that into account: "_The practice of reading from a different union member than the one most recently written to (called “type-punning”) is common. Even with -fstrict-aliasing, type-punning is allowed, provided **the memory is accessed through the union type**. (...) However, this code might not:_" – curiousguy Dec 24 '18 at 06:54
  • (...) Not only does the code that follows have the union definition "visible", it's **used** in the basic unit that does the type pun, but the pun read is done via a pointer dereference, not an lvalue expression that's a member access; so we understand that "through the union type" implies: **through a member access expression** of a member of that punning union. (I forgot to [link to GCC doc](https://gcc.gnu.org/onlinedocs/gcc-4.0.2/gcc/Optimize-Options.html) ) – curiousguy Dec 24 '18 at 06:57
  • @curiousguy: I do not consider C++ committee members opinions on C relevant. But thanks for the downvote and repeating the same argument once again. – Nominal Animal Dec 24 '18 at 11:19
  • @NominalAnimal I don't know I was "repeating the same argument" – curiousguy Dec 24 '18 at 11:33
  • @curiousguy: Yes, you are. The "requiring the visibility ..." idea is the only sane one, **because it corresponds to the real-world use cases we have today**. Your opinion is borne from the mistaken idea that C developers must follow the standards set by the committee and compiler writers -- probably because you only work within the Microsoft "ecosystem". If a future version of the standard breaks a large swath of existing code (basically all of POSIX and Berkeley sockets), it does not mean that that swath of code gets rewritten; it means the standard won't get adopted (except by MS, [...] – Nominal Animal Dec 24 '18 at 11:57
  • that has spent quite a lot of money to stuff the standards committee, in an effort to balkanize C, especially to keep POSIX as far away from C as possible. This is mainly because of historical reasons: after all, Microsoft does not have a C compiler at all, only a C++ compiler that can compile some C code.) Similarly, if your compiler writer friends decide to do something stupid, they won't "force the rest to follow"; they'll just expressly obsolete themselves. Existing codebase rules, you see; people who use C as a tool have no time for committee games. – Nominal Animal Dec 24 '18 at 12:00
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/185724/discussion-between-curiousguy-and-nominal-animal). – curiousguy Dec 24 '18 at 12:24