25

Following my previous question, I'm really curious about this code -

case AF_INET: 
    {
        struct sockaddr_in * tmp =
            reinterpret_cast<struct sockaddr_in *> (&addrStruct);
        tmp->sin_family = AF_INET;
        tmp->sin_port = htons(port);
        inet_pton(AF_INET, addr, tmp->sin_addr);
    }
    break;

Before asking this question, I've searched across SO about same topic and have got mix responses about this topic. For example, see this, this and this post which say that it is somehow safe to use this kind of code. Also there's another post that says to use unions for such task but again the comments on accepted answer beg to differ.


Microsoft's documentation on same structure says -

Application developers normally use only the ss_family member of the SOCKADDR_STORAGE. The remaining members ensure that the SOCKADDR_STORAGE can contain either an IPv6 or IPv4 address and the structure is padded appropriately to achieve 64-bit alignment. Such alignment enables protocol-specific socket address data structures to access fields within a SOCKADDR_STORAGE structure without alignment problems. With its padding, the SOCKADDR_STORAGE structure is 128 bytes in length.

Opengroup's documentation states -

The header shall define the sockaddr_storage structure. This structure shall be:

Large enough to accommodate all supported protocol-specific address structures

Aligned at an appropriate boundary so that pointers to it can be cast as pointers to protocol-specific address structures and used to access the fields of those structures without alignment problems

Man page of socket also says same -

In addition, the sockets API provides the data type struct sockaddr_storage. This type is suitable to accommodate all supported domain-specific socket address structures; it is large enough and is aligned properly. (In particular, it is large enough to hold IPv6 socket addresses.)


I've seen multiple implementation using such casts in both C and C++ languages in the wild and now I'm uncertain of the fact which one is right since there are some posts that contradict with above claims - this and this.

So which one is the safe and right way to fill up a sockaddr_storage structure? Are these pointer casts safe? or the union method? I'm also aware of the getaddrinfo() call but that seems a little complicated for the above task of just filling the structs. There is one other recommended way with memcpy, is this safe?

Community
  • 1
  • 1
Abhinav Gauniyal
  • 7,034
  • 7
  • 50
  • 93
  • You should never call `reinterpret_cast` if you are not sure and in particular you should never do it on structures you have not defined (except if it clearly documented as valid). – Phil1970 Feb 11 '17 at 18:03
  • 2
    @Phil1970 documented by whom? posix? microsoft? both document that it will align without problems on a cast operation but reading different ques/ans/comments I'm not certain if compilers allow this.. – Abhinav Gauniyal Feb 11 '17 at 18:05

2 Answers2

40

C and C++ compilers have become much more sophisticated in the past decade than they were when the sockaddr interfaces were designed, or even when C99 was written. As part of that, the understood purpose of "undefined behavior" has changed. Back in the day, undefined behavior was usually intended to cover disagreement among hardware implementations as to what the semantics of an operation was. But nowadays, thanks ultimately to a number of organizations who wanted to stop having to write FORTRAN and could afford to pay compiler engineers to make that happen, undefined behavior is a thing that compilers use to make inferences about the code. Left shift is a good example: C99 6.5.7p3,4 (rearranged a little for clarity) reads

The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If the value of [E2] is negative or is greater than or equal to the width of the promoted [E1], the behavior is undefined.

So, for instance, 1u << 33 is UB on a platform where unsigned int is 32 bits wide. The committee made this undefined because different CPU architectures' left-shift instructions do different things in this case: some produce zero consistently, some reduce the shift count modulo the width of the type (x86), some reduce the shift count modulo some larger number (ARM), and at least one historically-common architecture would trap (I don't know which one, but that's why it's undefined and not unspecified). But nowadays, if you write

unsigned int left_shift(unsigned int x, unsigned int y)
{ return x << y; }

on a platform with 32-bit unsigned int, the compiler, knowing the above UB rule, will infer that y must have a value in the range 0 through 31 when the function is called. It will feed that range into interprocedural analysis, and use it to do things like remove unnecessary range checks in the callers. If the programmer has reason to think they aren't unnecessary, well, now you begin to see why this topic is such a can of worms. (Modern compilers can optimize x << (y&31) into a single shift instruction for ISAs like x86 where the shift instruction implements that masking.)

For more on this change in the purpose of undefined behavior, please see the LLVM people's three-part essay on the subject (1 2 3).


Now that you understand that, I can actually answer your question.

These are the definitions of struct sockaddr, struct sockaddr_in, and struct sockaddr_storage, after eliding some irrelevant complications:

struct sockaddr {
    uint16_t sa_family;
};
struct sockaddr_in { 
    uint16_t sin_family;
    uint16_t sin_port;
    uint32_t sin_addr;
};
struct sockaddr_storage {
    uint16_t ss_family;
    char __ss_storage[128 - (sizeof(uint16_t) + sizeof(unsigned long))];
    unsigned long int __ss_force_alignment;
};

This is poor man's subclassing. It is a ubiquitous idiom in C. You define a set of structures that all have the same initial field, which is a code number that tells you which structure you've actually been passed. Back in the day, everyone expected that if you allocated and filled in a struct sockaddr_in, upcast it to struct sockaddr, and passed it to e.g. connect, the implementation of connect could dereference the struct sockaddr pointer safely to retrieve the sa_family field, learn that it was looking at a sockaddr_in, cast it back, and proceed. The C standard has always said that dereferencing the struct sockaddr pointer triggers undefined behavior—those rules are unchanged since C89—but everyone expected that it would be safe in this case because it would be the same "load 16 bits" instruction no matter which structure you were really working with. That's why POSIX and the Windows documentation talk about alignment; the people who wrote those specs, back in the 1990s, thought that the primary way this could actually be trouble was if you wound up issuing a misaligned memory access.

But the text of the standard doesn't say anything about load instructions, nor alignment. This is what it says (C99 §6.5p7 + footnote):

An object shall have its stored value accessed only by an lvalue expression that has one of the following types:73)

  • a type compatible with the effective type of the object,
  • a qualified version of a type compatible with the effective type of the object,
  • a type that is the signed or unsigned type corresponding to the effective type of the object,
  • a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
  • an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
  • a character type.

73) The intent of this list is to specify those circumstances in which an object may or may not be aliased.

struct types are "compatible" only with themselves, and the "effective type" of a declared variable is its declared type. So the code you showed...

struct sockaddr_storage addrStruct;
/* ... */
case AF_INET: 
{
    struct sockaddr_in * tmp = (struct sockaddr_in *)&addrStruct;
    tmp->sin_family = AF_INET;
    tmp->sin_port = htons(port);
    inet_pton(AF_INET, addr, tmp->sin_addr);
}
break;

... has undefined behavior, and compilers can make inferences from that, even though naive code generation would behave as expected. What a modern compiler is likely to infer from this is that the case AF_INET can never be executed. It will delete the entire block as dead code, and hilarity will ensue.


So how do you work with sockaddr safely? The shortest answer is "just use getaddrinfo and getnameinfo." They deal with this problem for you.

But maybe you need to work with an address family, such as AF_UNIX, that getaddrinfo doesn't handle. In most cases you can just declare a variable of the correct type for the address family, and cast it only when calling functions that take a struct sockaddr *

int connect_to_unix_socket(const char *path, int type)
{
    struct sockaddr_un sun;
    size_t plen = strlen(path);
    if (plen >= sizeof(sun.sun_path)) {
        errno = ENAMETOOLONG;
        return -1;
    }
    sun.sun_family = AF_UNIX;
    memcpy(sun.sun_path, path, plen+1);

    int sock = socket(AF_UNIX, type, 0);
    if (sock == -1) return -1;

    if (connect(sock, (struct sockaddr *)&sun,
                offsetof(struct sockaddr_un, sun_path) + plen)) {
        int save_errno = errno;
        close(sock);
        errno = save_errno;
        return -1;
    }
    return sock;
}

The implementation of connect has to jump through some hoops to make this safe, but that is Not Your Problem.

[EDIT Jan 2023: What this answer used to say about sockaddr_storage was wrong and I'm embarrassed to admit I didn't notice the problem for six years.] It is tempting to use struct sockaddr_storage as a convenient way to know how big to make the buffer for a call to getpeername, in a server that needs to handle both IPv4 and IPv6 addresses. However, it is less error-prone and has fewer strict-aliasing problems if you use a union with each of the concrete address families you care about, plus plain struct sockaddr:

#ifndef NI_IDN
#define NI_IDN 0
#endif

union sockaddr_ipvX {
    struct sockaddr sa;
    struct sockaddr_in sin;
    struct sockaddr_in6 sin6;
};

char *get_peer_hostname(int sock)
{
    union sockaddr_ipvX addrbuf;
    socklen_t addrlen = sizeof addrbuf;

    if (getpeername(sock, &addrbuf.sa, &addrlen))
        return 0;

    char *peer_hostname = malloc(MAX_HOSTNAME_LEN+1);
    if (!peer_hostname) return 0;

    if (getnameinfo(&addrbuf.sa, addrlen,
                    peer_hostname, MAX_HOSTNAME_LEN+1,
                    0, 0, NI_IDN) {
        free(peer_hostname);
        return 0;
    }
    return peer_hostname;
}

With this formulation, not only do you not need to write any casts to call getpeername or getnameinfo, you can safely access addrbuf.sa.sa_family and then, when sa_family == AF_INET, addrbuf.sin.sin_*.

A final note: if the BSD folks had defined the sockaddr structures just a little bit differently ...

struct sockaddr {
    uint16_t sa_family;
};
struct sockaddr_in { 
    struct sockaddr sin_base;
    uint16_t sin_port;
    uint32_t sin_addr;
};
struct sockaddr_storage {
    struct sockaddr ss_base;
    char __ss_storage[128 - (sizeof(uint16_t) + sizeof(unsigned long))];
    unsigned long int __ss_force_alignment;
};

... upcasts and downcasts would have been perfectly well-defined, thanks to the "aggregate or union that includes one of the aforementioned types" rule. If you're wondering how you should deal with this problem in new C code, here you go.

zwol
  • 135,547
  • 38
  • 252
  • 361
  • PERFECT! I still have 22 hours remaining to open a bounty, rest assured I'll award this answer one. "But maybe you need to work with an address family, such as `AF_UNIX`, that getaddrinfo doesn't handle." Exactly, I'm currently building a lib and since `AF_INET`, `AF_INET6` and `AF_UNIX` have lots of common functions.. eg `send`, `recv` and their friends which take a `sockaddr *` as parameter, I thought making a single `sockaddr_storage` variable and fill it conditionally and then passing it to underlying `sockaddr *` accepting calls. Now I can solve that by function overloading by creating .. – Abhinav Gauniyal Feb 12 '17 at 18:51
  • .. different functions that take `sockaddr_in`, `sockaddr_in6`, `sockaddr_un` structures respectively but they all will have exactly same code but different structures since I've abstracted away the difference with templates. I completely agree with you about creating exact structure for need eg `sockaddr_in` for `AF_INET` instead of general `sockaddr_storage` but that'll result in MxN problem where I've to maintain M functions for N different types with exactly same implementation. But I'll see if I can get past with runtime `if(AF_INET or other)` conditions or function overloading too.... – Abhinav Gauniyal Feb 12 '17 at 18:55
  • But let's say, forgive me for asking :p, if I need to follow down the `sockaddr_storage` general approach, will this method work - http://stackoverflow.com/a/42177009/2648679. Because I do have a case of datagram `recvfrom` call where I'll have to use `sockaddr_storage` since I'm not sure about the IPv4 mapped IPv6 addresses. Anyways I'm really grateful to you for this excellent answer and I'll provide a reference to it wherever possible. – Abhinav Gauniyal Feb 12 '17 at 19:00
  • I would need to see more of your code to understand what you are trying to accomplish and whether it is UB-safe. You should ask a new question about that. But I also want to reiterate that you should be using `getaddrinfo` and `getnameinfo`. To the extent I understand what you are trying to do, it really sounds to me like the majority of it is already handled by those functions. – zwol Feb 12 '17 at 19:18
  • [Declaration](https://github.com/c10k/net/blob/feature/socket/include/socket.hpp#L67) and [Definition](https://github.com/c10k/net/blob/feature/socket/src/socket.cpp#L135) and [creation of sockaddr_storage](https://github.com/c10k/net/blob/feature/socket/include/socket_family.hpp#L89) – Abhinav Gauniyal Feb 12 '17 at 19:19
  • The "first member" carveout lets you access `sin_family` safely even with strict aliasing and a cast to `sockaddr *`. There is no reason to make an initial `struct sockaddr` member. – tmyklebu Feb 13 '17 at 06:35
  • 1
    @tmyklebu I used to think so too, but neither gcc nor clang agrees. Counterexample at https://godbolt.org/g/WpQGQT ; see http://blog.regehr.org/archives/1466 and the paper it links to for extensive further analysis. – zwol Feb 13 '17 at 12:35
  • OK, that's nasty! But I'm not sure I believe the standard permits that "optimisation." You're accessing `a->sa_family`, which is an object of effective type `unsigned short`, via the lvalue `b->sin_family`, which is an lvalue expression of type unsigned short. This is explicitly allowed by C99 6.5.7. – tmyklebu Feb 13 '17 at 15:27
  • @tmyklebu Are you, though? Are you accessing an object of type `unsigned short` through an lvalue of type `unsigned short`? Or are you accessing an object of type `struct sockaddr` through an lvalue of type `struct sockaddr_in`? This gets into the "what exactly is an object" argument which, AFAIK, has never been resolved to _anyone's_ satisfaction, and the language in C2011 can still be read either way. – zwol Feb 13 '17 at 15:31
  • @tmyklebu Anyway, either the standard allows this, or there is a widespread and long-standing bug in multiple compilers, which the devs will probably claim is not a bug, and which must be worked around; the implications for application code are the same. I will think about how I can improve my answer on this point but it's already awful long. – zwol Feb 13 '17 at 15:33
  • 6.5.2 suggests that the following would be ill-formed if the relevant object was `struct sockaddr_in`: `(sin->sin_family = AF_INET) & (sin->sin_port = htons(12345))`. I have no reason to write code like that, but it seems insane to declare it ill-formed because I'm modifying the stored value of an object more than once between sequence points. – tmyklebu Feb 13 '17 at 15:35
  • @AbhinavGauniyal I reiterate that you need to ask a new question. I don't understand your library, even after having looked at the code, and the comment thread on this question is not the place for us to work it out. Also, please do not bother with a bounty. This is not a difficult question and you need the reputation more than I do. – zwol Feb 13 '17 at 15:36
  • Clang is good at having longstanding serious bugs that break tons of real code that the developers claim aren't bugs and which spawn papers about tools using clang to "find bugs in real code!" See https://people.csail.mit.edu/nickolai/papers/wang-undef-2012-08-21.pdf for example. – tmyklebu Feb 13 '17 at 15:38
  • Anyway, I buy that strict aliasing rules, as implemented, are respected by the "little bit differently" example you gave. I learned something---thanks! – tmyklebu Feb 13 '17 at 15:48
  • 2
    Note that while people in the 1990s may not have disputed that saying `struct sockaddr z = *p;` would have been UB if p was a pointer to something other than a `sockaddr`, I don't think people would have said the same about using `p->sa_family` to access that portion of the Common Initial Sequence. To the contrary, the ability to access members of the CIS through a structure pointer is a big part of what made the CIS *useful*. The notion that code should use `memcpy` for such purposes would have been widely recognized as dumb, especially since... – supercat Feb 13 '17 at 16:41
  • ...a compiler could assume that `p->sa_family` would be accessing a 16-bit integer value, but if code used `memcpy` to copy stuff to a `sockaddr` before accessing `sa_family` the compiler would have to allow for the possibility that it could alias anything anywhere. – supercat Feb 13 '17 at 16:43
  • PS--the desire of people to turn C into FORTRAN is particularly absurd when one considers the high-performance computing features that FORTRAN has that C lacks, and that adding directives to enable optimizations would allow better performance without affecting C's usability for embedded or systems code. An "optimization" which improves performance by 0.5% may be worthwhile if it imposes *zero* costs, but if it's necessary for programmers to add extra code to guard against the optimization it can easily become worse than useless. – supercat Feb 13 '17 at 16:52
  • @tmyklebu: The authors of gcc do not interpret the CIS rule as allowing pointers of one structure type to read members of the CIS of another, even when a complete union declaration containing both types is visible everywhere that either structure is accessed. For some reason, the notion that the visibility of such a union declaration should allow common-initial-sequence access was derisively regarded as a a silly and useless impediment to optimization, despite the fact that structure types are most likely to appear together in unions in circumstances where... – supercat Feb 13 '17 at 17:30
  • 1
    ...applying the CIS rule to pointers would be useful, and despite the fact that honoring such rule by default would in no way prevent gcc from including a directive to waive such treatment in cases where it isn't needed. I think some people view a default behavior which may generate sub-optimal code in the absence of an optimization directive, as more dangerous than a default behavior which might behave randomly in the absence of an optimization-blocking directive,. – supercat Feb 13 '17 at 17:37
  • @zwol: The authors of gcc claim that the standard is "ambiguous" and that they are therefore entitled to interpret it in a fashion which ignores the "complete union type" language therein. I would not call the willful refusal to abide by the Standard except when using `-fno-strict-aliasing` a "bug". Since it is normal for compilers to default to a non-conforming mode, the only "defect" is a failure to document include `-fno-strict-aliasing` among the options that must be specified to put the compiler into standard-conforming mode. – supercat Feb 13 '17 at 17:39
  • Incidentally, the rationale for C89 suggests that the rules were intended to permit compilers to make optimizations in cases *where a compiler would have no reason to expect them to matter*, e.g. consecutive accesses to a `float`, an indirected `int*`, and a `float`, *without any intervening actions that involve a `float*`*. Is there *any evidence* that the authors of the Standard intended that the rules force programmers jump through hoops to achieve things that could otherwise be done easily in their absence, or did they expect that compiler writers would try to recognize aliasing... – supercat Feb 13 '17 at 17:50
  • ...in cases where it was obvious (as was likely the case even on existing compilers that used type-based aliasing assumptions) and figure that there was no need to clutter the Standards with such details? – supercat Feb 13 '17 at 17:54
  • 3
    @supercat As I have said before, I'm not interested in discussing "how things should be" with you in the comments of this kind of answer, which are about "how things _are_ and how to deal with that." I'm especially not interested in defending or criticizing the behavior of compiler developers, which is a group of people that no longer includes me. – zwol Feb 13 '17 at 18:01
  • @zwol pardon me but I've one last doubt that I very much need to get clear, how about the case where I make a `sockaddr_storage ss` object and then `sockaddr_un *su = (sockaddr_un *)&ss;` and then fill it up `su->sun_family=AF_UNIX` and then call `bind()` or other system calls on `ss`. **Note** that I haven't yet broken the rule of strict aliasing since I have only filled, not dereferenced the pointer of different type. Will the system calls of socket family automatically use correct type according to `ss.ss_family` data? Is that member even correctly filled while using above code? – Abhinav Gauniyal Feb 18 '17 at 18:40
  • Here - https://books.google.co.in/books?id=xUs8s2-bNH8C&pg=PA548&lpg=PA548&dq=sockaddr_un+to+sockaddr_storage&source=bl&ots=B7c9eMsmwy&sig=yDj5-kuYFjbJT4vvSQYA8euFJ_8&hl=en&sa=X&ved=0ahUKEwips67To5rSAhWIt48KHcA-BAEQ6AEIPjAG#v=onepage&q=sockaddr_un%20to%20sockaddr_storage&f=false – Abhinav Gauniyal Feb 18 '17 at 18:40
  • @AbhinavGauniyal You break the rules as soon as you do `su->sun_family=AF_UNIX`. – zwol Feb 19 '17 at 13:40
  • Clang and gcc do not, in general, correctly handle situations where code passes the address of a union member to a function. They will handle such cases correctly in scenarios where the caller and callee are processed separately, and neither can play any role in code generation for the other, but those situations would also be handled correctly without aliasing issues with or without the union. – supercat Jan 19 '23 at 18:39
  • @supercat IMNSHO the code I wrote is required to work -- there's no ambiguity about the rules for type punning via a union. If it doesn't, that's bad on multiple levels. Do you have a self-contained testcase for the miscompilation you are describing? – zwol Jan 19 '23 at 22:47
  • The code may work by happenstance on today's versions of clang and gcc, but look at https://godbolt.org/z/rn51cracn for something not too different where gcc behaves, by design, in a manner contrary to what K&R2 would imply. If gcc could tell that the same subscript was being passed to `update_fancy1` as `read_simple`, it would generate sensible code, but I know of no specification that would suggest it should be expected to behave sensibly in that case if it can't be trusted in the general case. – supercat Jan 19 '23 at 23:49
  • @supercat Thanks, that's helpful. It seems to me that `&addrbuf.sa` and `(struct sockaddr *)&addrbuf` would be equally broken in this case, the issue being that, on the far side of a call to a function taking a `struct sockaddr *`, the fact that the pointer refers to one member of a union has been masked. Is that your experience? – zwol Jan 20 '23 at 18:33
  • @zwol: Yeah. While there are many circumstances where code taking the address of a union member would work, the only situation I know of where taking the address of a union member would work *but casting the union's address wouldn't also work* is when a union member is an array, the address is taken as a result of pointer decomposition, and *the implicitly-formed address is directly dereferenced using the `[]` operator*. GCC recognizes type punning given `someUnion.someArray[index]` but not `*(someUnion.someArray+index)` even though the Standard *defines* the former as meaning the latter. – supercat Jan 20 '23 at 18:41
  • @zwol: IMHO, the Standard should recognize a category of implementations that is unable to handle use of a pointer formed by taking a union member's address to access things of the member's type, and allow such implementations to make the address-of operator return a typeless pointer that can be assigned only to a `void*`. – supercat Jan 20 '23 at 19:14
8

Yes, it's an aliasing violation to do this. So don't. There's no need to ever use sockaddr_storage; it was a historical mistake. But there are a few safe ways to use it:

  1. malloc(sizeof(struct sockaddr_storage)). In this case, the pointed-to memory does not have an effective type until you store something to it.
  2. As part of a union, explicitly accessing the member you want. But in this case just put the actual sockaddr types you want (in and in6 and maybe un) in the union rather than sockaddr_storage.

Of course in modern programming you should never need to create objects of type struct sockaddr_* at all. Simply use getaddrinfo and getnameinfo to translate addresses between string representations and sockaddr objects, and treat the latter as completely opaque objects.

R.. GitHub STOP HELPING ICE
  • 208,859
  • 35
  • 376
  • 711
  • @jww: I don't follow. FWIW the accepted answer to the question you told me to read is by yours truly. ;-) – R.. GitHub STOP HELPING ICE Feb 12 '17 at 00:15
  • 3
    Well, in modern programming, you should never need to compile without `-fno-strict-aliasing` at all. :) – tmyklebu Feb 13 '17 at 06:35
  • 1
    @tmyklebu: Failure to specify that option can cause gcc to generate bogus code *even for programs which never read any storage as any type other than what was used to write it*. That may not be deliberate, but it's indicative of a compiler design that may omit during early optimization stages code which wouldn't generate any instructions but is nonetheless required to block other "optimizations". – supercat Feb 13 '17 at 18:03
  • @tmyklebu: You don't need to. You just need to refrain from writing incorrect code, and almost all code using `sockaddr_storage` is incorrect. My answer covers a few of the non-incorrect ways you could go about using it, but stresses that you shouldn't be using it to begin with. – R.. GitHub STOP HELPING ICE Feb 13 '17 at 18:14
  • @R..: It's quite hard to "refrain from writing incorrect code." Especially when the only indication you've written incorrect code happens months later, in production, and apparently in an unrelated piece of code. The "safe idioms" for dealing with this involve writing `memcpy` when you mean assignment or doing all accesses via union. These generate illegible code that may be terribly slow and have different atomicity properties than intended if the compiler doesn't catch your meaning. Far safer and easier just to switch the land mines off. – tmyklebu Feb 13 '17 at 18:38
  • 1
    @R..: I am reminded of an Americanism: "Guns don't kill people; people kill people." Compilers don't make code fast; people make code fast. Strict aliasing, as interpreted by compiler vendors today, creates a barrier to optimisation (the kind done by people) by creating additional "gotchas" for the optimiser to avoid. We can already use `restrict` when we want to tell the compiler that accesses via distinct pointers in some set don't alias. – tmyklebu Feb 13 '17 at 18:47
  • @tmyklebu: To that I would add that there is a *huge* difference between the scenario offered in the C89 rationale as a justification for aliasing rules, versus the scenarios that modern compilers "optimize". A lot of people who despise modern interpretations of the rule have no objection to the notion that compilers should assume things won't alias *when there's no evidence that they will* (as was the case in the rationale's example). Some simple tweaks to the aliasing rules would fix the vast majority of code that presently requires `-fno-strict-aliasing` without... – supercat Feb 13 '17 at 20:01
  • @tmyklebu: If you have any casts of pointer types, you most likely have incorrect code. It's pretty damn simple to refrain from doing any. This case (`sockaddr_storage`) is a classic, textbook example of how much of UB comes from doing gratuitously idiotic things (poking at `sockaddr` internals) when you could just use the right approach (opaque objects) and have your code be much simpler and avoid the risk of anything going wrong. – R.. GitHub STOP HELPING ICE Feb 13 '17 at 20:08
  • @supercat: You're wrong. *Locally*, evidence of possible aliasing is not too hard to find. The problem is that the possibility of aliasing is not a local property. You can see an example where they *tried* to do what you suggested in the "union exception". Confusingly it only works when you directly access the union object and the original object is a union. It can't work when you take pointers to members of the union and access them from elsewhere (because the aliasing is not localized) and it can't work (despite many people naively expecting it to) when you cast a pointer to something else.. – R.. GitHub STOP HELPING ICE Feb 13 '17 at 20:12
  • ...significantly hurting optimization in most cases (the key concept would be that a cast from `T*` to non-void pointer type, either directly or through `void*`, should be treated as a potential access to a `T`). Defining all the corner cases would require some extra verbosity, but the key point is that pointer casts are seldom present in code which would be amenable to optimization, but are usually present in code which relies upon aliasing, and would thus provide an easy way for compilers to distinguish them. – supercat Feb 13 '17 at 20:12
  • ...to a pointer-to-union type in order to type-pun. The solution is not making more confusing localized-access-only exceptions to aliasing rules, but rather getting people to **stop type-punning**. It's wrong and it's useless. – R.. GitHub STOP HELPING ICE Feb 13 '17 at 20:13
  • A much more productive course of action for compilers would be to detect what aliasing they can (localized only) and generate the equivalent of `__builtin_trap()` for it, rather than trying to patch it up and make it work. Once wrong programs start crashing all over the place, people will fix them. And you have to either get pretty unlucky, or work hard at it, to write a program with no localized aliasing but distant non-local aliasing. – R.. GitHub STOP HELPING ICE Feb 13 '17 at 20:16
  • @R..: Most code that relies upon aliasing involve casting a pointer to some object and using the resulting pointer as the exclusive means of access to the object during the pointer's lifetime (similar to what `restrict` would imply). Optimizations that rely upon aliasing generally involve two identifiable accesses to an object and a presumption that no other accesses offer between them. Even if every pointer cast from `T*` to a non-void pointer or to a `void*` which leaves the compiler's sight were treated as breaking a connection between the preceding and following accesses to a `T`, ... – supercat Feb 13 '17 at 20:17
  • 2
    ...the costs of that would be trivial compared with the cost of `-fno-strict-aliasing`. Also, I would suggest that even if you don't find any aliasing constructs useful, the widespread usage of a number of aliasing patterns to do things that can't otherwise be done constitutes *prima facie* evidence that they are useful--just not to you. – supercat Feb 13 '17 at 20:20
  • @R..: Also, w/rt local vs. distant analysis, a compiler is required to make pessimistic assumptions about code it can't see. If a function accesses a `T*` directly, any caller that can't see what `T*` was accessed must assume that it may have accessed any `T` whose address it could acquire. What is difficult about saying that if a function casts a `T*` to some other type and can't tell what is done with it, callers should have to assume--as with the function that accessed a `T*` directly, that it might access a `T`? – supercat Feb 13 '17 at 20:42
  • @supercat: Under your idea, a function accessing data via `A*` and `B*` would always have to assume they might alias. You'd be stuck with awful codegen (spurious spills/reloads, no vectorization) unless you used `restrict` all over the place, i.e. the same thing people get when they foolishly use `-fno-strict-aliasing`. This is a real pain, as it makes it completely impractical to use "context" structures in-place because every access goes back to memory; you have to manually cache all the members in local variables. And all this to solve a non-problem. – R.. GitHub STOP HELPING ICE Feb 13 '17 at 22:27
  • @R..: Compilers generate comically bad code when they're left in charge of vectorising anything nontrivial. You still need people to vectorise code if you want to get good results consistently. Regarding the "spurious" spills and reloads, you've got a choice between them and silent generation of wrong code when you aren't expecting it. – tmyklebu Feb 13 '17 at 22:46
  • @R..: You're misinderstanding my proposed rule, which would allow a compiler to assume within the function that the arguments didn't alias *unless* a pointer of one type was cast to the other, in which case it would only have to consider aliasing between the cast and the first usage of the storage in question by some means other than by a pointer derived from the cast. If a compiler sees two accesses to an object of type `A`, and between those accesses there are no potential accesses to an `A*` or `char*`[etc.], and no casts from `A*` to some other type, then a compiler may assume that... – supercat Feb 13 '17 at 22:53
  • @supercat: That breaks when the cast and the access are non-local with respect to each other, which can easily happen in practice, and which is really going to mess up users' intuitions when it does. – R.. GitHub STOP HELPING ICE Feb 13 '17 at 22:57
  • ...nothing will alias the object of type `A` between those accesses. – supercat Feb 13 '17 at 22:59
  • @R..: If you think type punning is dumb and useless, try getting rid of it in Linux. Hell, try getting rid of it in `int float_to_int_bitwise(float)` and more complex code in the same vein without sacrificing readability even further. – tmyklebu Feb 13 '17 at 23:00
  • @R..: I'm not calling for compilers to support situations where the cast and usage of the cast value are separated by an access to the underlying value via means unrelated to the result of the cast. That's far less common than code which casts a pointer and then performs all the accesses that are going to occur using the result of the cast before the object is next accessed via unrelated means. – supercat Feb 13 '17 at 23:04
  • @tmyklebu: Regarding "spurious" spills, an even better choice between accepting them or bogus code would be to use `restrict` to let the compiler avoid spurious spills. Unlike type-based aliasing, that will even work when using character pointers. Type-based aliasing assumptions may allow optimizations in a few cases where `restrict` wouldn't be usable, so an option to have a compiler apply TBAA conservatively (rather than disabling it altogether) may be helpful, but in code that uses `restrict` effectively should generally work decently even without it. – supercat Feb 13 '17 at 23:48
  • @tmyklebu Re: "never need to compile without -fno-strict-aliasing": if I understand correctly, Linux kernel is built using `-fno-strict-aliasing`. – pmor May 11 '22 at 21:47
  • 2
    @pmor: Both clang and gcc use abstraction models that are incompatible with the Standard's rule that allows recycling of storage to hold different types. Defect Report 236 was proposed to change the Standard so that no storage which ever held any particular type could ever be used to hold any other, but that request was rejected. Despite that, clang and gcc continue to use an abstraction model where the Effective Type of storage that has written as type T and later as type U may later spontaneously revert to T, even if it is never read as any type other than the last written. – supercat Sep 30 '22 at 15:32
  • 2
    Further, gcc sometimes treats constructs like `if (flag) *(long)ptr = 1; else *(long long)ptr=1;` as equivalent to `*(long long)ptr=1;` and assumes they will never access a `long` even if `flag` is 1. Consequently, specifying any optimizations without using `-fno-strict-aliasing` will yield a language dialect which is should be viewed as only suitable for use with programs that never repurpose storage for use as different types, and I would trust code that specifies that it should only be compiled with `-fno-strict-aliasing` far more than I would trust code that doesn't specify that. – supercat Sep 30 '22 at 15:35
  • @supercat: Citation needed. That would be a serious bug if it does, and I'm pretty sure it does not do that. It skips the branch and emits a single version of the code for both, but the type-based aliasing knowledge can't assume the type afterwards, and I haven't seen any evidence that it does. If you have such evidence, **please** post it, as this is material for a critical CVE-worthy codegen bug report. – R.. GitHub STOP HELPING ICE Sep 30 '22 at 19:50
  • @R..GitHubSTOPHELPINGICE: See https://godbolt.org/z/83v4ssrn4 for an example of the described behavior. – supercat Sep 30 '22 at 20:01
  • @R..GitHubSTOPHELPINGICE: Also, see https://godbolt.org/z/jfv1Ge6v4 for an example of how clang and gcc are unable to abide the response to Defect Report 236. Function `test1()` writes storage with a long value which should be ignored. Function `test2()` writes the same storage with a `long long` and reads it back. Function `test3()` writes the storage twice with `long` and reads it back. All strictly conforming, but neither gcc can process it correctly. – supercat Sep 30 '22 at 20:37
  • @supercat: Lovely. Has this been filed on the bug tracker? If not, it's going to be soon... – R.. GitHub STOP HELPING ICE Sep 30 '22 at 20:45
  • 1
    @R..GitHubSTOPHELPINGICE: You should probably post both bugs on the bug tracker, since I don't have an account there, and am unaware of them having been posted recently. Since bugs I've squawked about that were put on the bug tracker years ago have yet to be fixed, I've given up on worrying about which bugs are on the reporting system or not. Given a choice between correctly supporting corner cases mandated by the Standard or performing optimizations that, while unsound, will "usually" work, I think the gcc maintainers would rather do the latter. – supercat Sep 30 '22 at 21:05
  • @R..GitHubSTOPHELPINGICE: To properly handle these and many other aliasing issues, gcc would need to treat type and data dependencies as a directed acyclic graph, but from what I can tell it instead treats them as an equivalence relation. A DAG-based abstraction would be able to block potential optimizing transforms in the few cases they'd cause trouble while applying them elsewhere, but an equivalence-based one cannot. – supercat Sep 30 '22 at 21:15
  • @supercat: I think another option would be using lower-precision knowledge for TBAA, e.g. just the "mode" (scalar type size) rather than the actual type to infer non-aliasing. – R.. GitHub STOP HELPING ICE Sep 30 '22 at 21:39
  • @supercat: Filed as the memorable bug number [107107](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107107) :-) – R.. GitHub STOP HELPING ICE Sep 30 '22 at 21:40
  • @R..GitHubSTOPHELPINGICE: Or an option to treat pointers, collectively, as one type, and all other types as compatible with each other but not with pointers, which is what CompCertC does, which would allow some optimizations in programs which would otherwise require `-fno-strict-aliasing`. On the flip side, a compiler which was paranoid in the presence of typecasts or `volatile`-qualified accesses could perform TBAA much more aggressively in their absence, while still being compatible with most non-portable programs. Add a mode to throw out the "character type" exception and performance... – supercat Sep 30 '22 at 22:03
  • ...could often be better than would be allowable in a conforming implementation, since even code which relies upon the character-type exception will almost always have type casts or other indications that a compiler needs to perform actions "literally" unless it can fully understand everything about the pointers involved. – supercat Sep 30 '22 at 22:04
  • @R..GitHubSTOPHELPINGICE: Nice write-up. Have you looked at the other example I posted, which shows another problem with collapsing code, in this case between the second statement of `test2()` and the second statement of `test3()`, inclusive? – supercat Sep 30 '22 at 22:13
  • @supercat: I haven't looked at it yet. Do you think it merits a different bug report (i.e. does it likely have a different root cause), or just comments on the same one about other affected constructs? – R.. GitHub STOP HELPING ICE Sep 30 '22 at 23:09
  • @supercat: Are you sure just the pointer/non-pointer thing would suffice? I would expect to hit the same bug with branches using `void *` vs `uintptr_t` lvalues to modify the object, where the code has perfectly well-defined behavior but breaks due to GCC collapsing the branches to the same code and losing track of the type of the lvalue the store happened through on the abstract machine. – R.. GitHub STOP HELPING ICE Sep 30 '22 at 23:12
  • @R..GitHubSTOPHELPINGICE: It's similar in that it reveals type-based aliasing logic that gets broken when operations get consolidated, but the nature of the operations being consolidated is different, and this example breaks in both clang and gcc. – supercat Oct 01 '22 at 05:28
  • @supercat: OK, comments on the first bug seem to agree it might be different so I'll file it separately. Thanks. – R.. GitHub STOP HELPING ICE Oct 01 '22 at 14:16
  • @R..GitHubSTOPHELPINGICE: If the compiler merges constructs that affect the bit patterns without accommodating different TBAA implementations, grouping types as equivalent won't fix all such potential problems. On the other hand, if there were an option to e.g. only treated pointers and "everything else" as the only two non-aliasable types, the only programs for which that would not be sufficient would be those that deliberately produced pointers with the same representations as integers or vice versa. – supercat Oct 01 '22 at 16:45
  • @R..GitHubSTOPHELPINGICE: On the other hand, my general perception of gcc is that if there programs which other compilers would process usefully, and which the Standard allows other compilers to process usefully, the maintainers of gcc and clang would rather argue about whether the Standard *requires* them to behave meaningfully, than acknowledge that a compiler which behaves meaningfully even when not *required* to do so may for many purposes be more useful than one which would use the Standard as an excuse to behave in meaningless fashion. – supercat Oct 01 '22 at 16:49
  • @supercat: I've now filed the second as https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107115 – R.. GitHub STOP HELPING ICE Oct 01 '22 at 20:01
  • @R..GitHubSTOPHELPINGICE Was the second one filed to LLVM? I [cannot find it](https://github.com/llvm/llvm-project/issues?q=is%3Aissue+is%3Aopen+TBAA). – pmor Oct 04 '22 at 19:43
  • 1
    @pmor: If it does get reported to clang, I wonder if the maintainers will make a minimal tweak to fix the test case, or whether they'll actually fix the underlying problem which the test case is designed to illustrate? – supercat Oct 05 '22 at 05:50
  • 1
    @supercat Created: https://github.com/llvm/llvm-project/issues/58225. – pmor Oct 07 '22 at 09:48
  • @pmor: I was being snarky with that last remark, but it will be interesting to see what kind of fix emerges. It seems like people are working on gcc, but I find concerning the quote "Yep, we've been playing whack-a-mole already, so we should just continue and fix places that pop up with this issue" since that suggests a willingness to accept a design that will never be fundamentally sound. – supercat Oct 07 '22 at 14:49
  • @pmor: Ignore my last [deleted] comment. I had was A/B testing the two compiler versions and failed to select the latest. – supercat Oct 07 '22 at 15:21