
This question is an extension of one I asked before. After some time, I find that parts of my understanding of conversion behavior between two pointer types are still unclear.

To facilitate the discussion, I first make the following assumptions about the host implementation:

  • malloc: 8-aligned
  • sizeof(int): 4, _Alignof(int): 4
  • sizeof(double): 8, _Alignof(double): 8

Question one:

void *ptr = malloc(4096);        // (A)

*(int *) ptr = 10;               // (B)               

/*
 * Does the following line have undefined behavior
 * or violate strict aliasing rules?
 */
*(((double *) ptr) + 2) = 1.618; // (C)

// now, can still read integer value with (*(int *) ptr)

In my current understanding, the answer is No.

According to [6.3.2.3 #7] of C11:

A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined. ...

and [6.5 #7] of C11:

An object shall have its stored value accessed only by an lvalue expression that has one of the following types:

  • a type compatible with the effective type of the object,
  • ...

Therefore, as I understand it:

  • After line (A), I have allocated an object that has no declared type and does not yet have an effective type.
  • After line (B), the first 4 bytes of the allocated object have the effective type int.
  • For line (C), ptr is correctly aligned for the double type, so the pointer cast and the pointer arithmetic are legal. Because line (C) does not access the first 4 bytes, it does not break the 6.5 #7 rule.

Do I have any misunderstandings about what I have mentioned above?


Question two:

void *ptr = malloc(4096);        // (A)

*(int *) ptr = 10;               // (B)

/*
 * Does the following line have undefined behavior
 * or violate strict aliasing rules?
 */
*(double *) ptr = 1.618;        // (C)

// now, shall not read value with (*(int *) ptr)

In my current understanding, the answer is also No.

According to [6.5 #6] of C11:

If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value.

So, in my knowledge, the line (C) is a subsequent access that modifies the stored value and updates the effective type of the first 8 Bytes to double. Do I have any misunderstandings about what I have mentioned above?

My main confusion is whether this violates the [6.5 #7] rule:

An object shall have its stored value accessed only by an lvalue expression that has one of the following types:

  • a type compatible with the effective type of the object,
  • ...
Richard Bryant
  • The effective type rule is pretty broken. Because it does say "If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value." But you do a write access here, so: "For all other accesses to an object having no declared type, the effective type of the object is simply the type of the lvalue used for the access." Meaning you could access the original object through double just fine too. – Lundin Apr 15 '21 at 14:43
  • I don't think they intended for the standard to be read that literally, but this part of the language is just broken. – Lundin Apr 15 '21 at 14:44
  • @Lundin: Re “Meaning you could access the original object through double just fine too”: The “all other accesses” part does not apply here. When they are doing `* (double *) ptr =`, it is covered by “If a value is stored into an object having no declared type through an lvalue having a type that is not a character type,” not the “all other” clause. If they are reading it after having written an `int`, it is covered by “… for subsequent accesses that do not modify the stored value” and again does not fall into the “all other” clause. – Eric Postpischil Apr 15 '21 at 14:50
  • @EricPostpischil Yes I agree, I referred to the Q2 by the OP. – Lundin Apr 15 '21 at 14:52
  • It might help to think of [6.5 #6] and [6.5 #7] being applied sequentially, so that if the effective type of the object is changed by #6, then it is already compatible with the type of the lvalue expression in #7. – Ian Abbott Apr 15 '21 at 17:12
  • @Lundin: If the aliasing rules were read as saying that storage which has been accessed via some type *during the execution of some function or loop* may only be accessed *in that context* by an lvalue *which has a freshly-established visible relationship to* one of the indicated types, and if it had explicitly left recognition of such relationships as a quality-of-implementation issue, that would eliminate the need for the broken and unworkable Effective Type rule whose corner cases cannot be handled properly without forgoing a substantial number of what should be useful and safe optimizations. – supercat Apr 15 '21 at 17:18

3 Answers


To facilitate the discussion, I first make the following assumptions about the host implementation [...]

These assumptions are almost completely irrelevant. The only constraint that matters for the particular questions posed is that sizeof(int) <= 2 * sizeof(double).

In particular, malloc() is guaranteed to return a block that is suitably aligned for any object type with a fundamental alignment requirement (C11 7.22.3), which covers all of the built-in types.
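As a rough illustration of that guarantee, here is a minimal sketch. The helper name malloc_is_max_aligned is made up for this example, and it assumes that converting a pointer to uintptr_t yields the address, which is implementation-defined but true on mainstream platforms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* max_align_t is at least as strictly aligned as every scalar type. */
static_assert(_Alignof(max_align_t) >= _Alignof(double),
              "max_align_t must cover double");

/*
 * Hypothetical helper, not from the original discussion.
 * Returns 1 if malloc's result is a multiple of _Alignof(max_align_t),
 * 0 if it is not, and -1 if the allocation failed.
 * Assumption: the pointer-to-uintptr_t conversion yields the address.
 */
int malloc_is_max_aligned(size_t size)
{
    void *p = malloc(size);
    if (p == NULL)
        return -1;
    int ok = ((uintptr_t)p % _Alignof(max_align_t)) == 0;
    free(p);
    return ok;
}
```

On a conforming hosted implementation one would expect malloc_is_max_aligned(4096) to return 1, whatever the particular sizes and alignments of int and double are.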

Question One:

Your analysis is correct: there is no strict-aliasing violation.
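A minimal sketch of Question One, wrapped in a function so that the int can be read back. The function name question_one and the allocation check are additions for illustration only. The static assertion spells out the one size constraint mentioned above.

```c
#include <assert.h>
#include <stdlib.h>

/* The double at index 2 starts at byte offset 2 * sizeof(double),
 * so it cannot overlap an int stored at offset 0 if this holds. */
static_assert(sizeof(int) <= 2 * sizeof(double),
              "int at offset 0 must not overlap the double at index 2");

/* Returns the int read back after the disjoint double store,
 * or -1 if the allocation failed. */
int question_one(void)
{
    void *ptr = malloc(4096);          /* (A) no effective type yet */
    if (ptr == NULL)
        return -1;

    *(int *)ptr = 10;                  /* (B) bytes [0, sizeof(int)) become int */
    *((double *)ptr + 2) = 1.618;      /* (C) touches a different region */

    int value = *(int *)ptr;           /* still read through int */
    free(ptr);
    return value;
}
```

Because the two stores touch non-overlapping bytes, each region keeps its own effective type and the int is read back with the same type it was written with.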

Question Two:

According to [6.5 #6] of C11:

If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value.

So, in my knowledge, the line (C) is a subsequent access that modifies the stored value and updates the effective type of the first 8 Bytes to double.

Yes, line (C) modifies the stored value of *(double *) ptr, and although ptr has a declared type, the object designated by *(double *) ptr, being part of a dynamically allocated block, does not. Therefore, by paragraph 6.5/6, the effective type of the object designated by *(double *) ptr becomes the type of the expression *(double *) ptr (that is, double) including for that access itself. The exception at the end of the paragraph serves to avoid a conflict between that and the effect of the access at your (B).

Thus, there is no strict-aliasing violation at (C). The lvalue used for access is *(double *)ptr. Its type is double, and according to 6.5/6, that is also the effective type of the object being accessed, notwithstanding any other effective type that that object or any part of it may have had. This satisfies the first alternative of the SAR.
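Under that reading, the sequence from Question Two followed by a read through double can be sketched as below. The function name question_two and the allocation check are additions for illustration. The sketch deliberately never reads the storage as int after line (C).

```c
#include <assert.h>
#include <stdlib.h>

/* Returns the double read back after the store at (C),
 * or -1.0 if the allocation failed. */
double question_two(void)
{
    void *ptr = malloc(4096);          /* (A) no effective type yet */
    if (ptr == NULL)
        return -1.0;

    *(int *)ptr = 10;                  /* (B) effective type int for these bytes */
    *(double *)ptr = 1.618;            /* (C) the store makes the effective type double */

    double value = *(double *)ptr;     /* read with the type last stored */
    free(ptr);
    return value;
}
```

The read uses the same type as the most recent store, so the lvalue type is compatible with the effective type established at (C).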

John Bollinger
  • While a straightforward interpretation of the Standard's text might imply that storing a value of type T to an object will allow it to be read back as a type T regardless of any previous Effective Type it may have had, I don't think that's true of the dialect processed by clang or gcc. Does anything in the example included in my answer not conform to the Standard as written? – supercat Apr 15 '21 at 19:17
  • @supercat, I don't see anything non-conforming in your example, but I do not follow your complaint there, either. You seem to be objecting to optimizations performed by GCC and Clang, but the standard does not speak to that. Let's suppose a conforming hosted implementation where `sizeof(long) == sizeof(long long)` and that accepts the program, and a run in which the `malloc` succeeds. If I follow the code correctly then the standard requires only that the program print "2 should equal 2" followed by a newline. It does not require that all the memory updates actually be performed. – John Bollinger Apr 15 '21 at 23:21
  • Actually `__attribute` is an extension, and it does make the program non-conforming, but the above analysis would apply to the program as modified by removing the attribute declaration. – John Bollinger Apr 15 '21 at 23:31
  • In my example, both clang and gcc will output "1 should be 2" when optimizations are enabled. The `__attribute` does mean the program isn't a *Strictly* Conforming C Program, but the fact that its behavior is dependent upon whether `long` and `long long` are the same size would mean that anyway. The key point is to focus on how the compiler processes the `test()` function in cases where it doesn't know that the indices will be equal; both clang and gcc appear to take two optimizations, either of which would be fine in isolation, but combine them in a way that isn't valid. – supercat Apr 16 '21 at 07:04
  • If the godbolt tester made it possible to write a program that combined two compilation units, I'd simply write `test()` in one compilation unit and `main()` in the other. The `__attribute` exists to provide essentially the same effect, though using a volatile function pointer might be clearer. What's important for purposes of the recipe is that the compiler recognize that both uses of `index` will yield the same value, and all three uses of `index3` will yield the same value, but that it not know whether `index` and `index2` match, nor whether `index3` matches either `index` or `index2`. – supercat Apr 16 '21 at 07:12
  • @JohnBollinger: Very clear. Thanks y’all. – Richard Bryant Apr 16 '21 at 09:35
  • @supercat, I confirm that your program produces different output for me with GCC depending on optimization level used. I see no justification for that in the standard, and therefore I account it a bug in GCC. I can't speak to why the GCC maintainers might knowingly allow that nonconformity to persist, but I am inclined to doubt that they justify it by a plausible alternative interpretation of the standard. I take you to be warning that it is unwise to rely on implementations to conform in all areas where aliasing analysis is involved, and that warning is well made. – John Bollinger Apr 16 '21 at 12:42
  • @JohnBollinger: The fact that both gcc and clang exhibit the same behavior here suggests to me that it is by design. While each processes some corner cases correctly that the other does not, the number of circumstances where both exhibit the same nonsensical behavior is IMHO too large to be coincidental. I am also flabbergasted at the inability of either compiler to offer a mode that would limit optimizations to objects whose address is never used in any way that can't be fully tracked and accounted for. While that may only allow for 50-90% of the improvements that could be reaped... – supercat Apr 16 '21 at 14:36
  • ...compared with `-O0`, it would eliminate the extremely vast majority of code generation bugs. I suspect the failure to include such a mode stems from a fear that if it were included, the fraction of programs that enable the more clever optimizations would plummet, and it would instantly become apparent that the priorities of the clang/gcc development teams have for decades been at odds with what many of their customers actually want. – supercat Apr 16 '21 at 14:47
  • @JohnBollinger: BTW, I forget the defect report number, but there was a defect report whose authors suggested striking the "... that do not modify the stored value". I suspect that what has happened here and in some other parts of the language is that rather than trying to catalog all corner cases that implementations should support, the authors of the Standard chose to list corner cases that they thought could only be met by abstraction models that would also meet the other corner cases that should be supported. Clang and gcc, however, tried to design an abstraction model that would fit... – supercat Apr 16 '21 at 14:54
  • ...the mandated corner cases as narrowly as possible, and regard the aspects of the Standard that don't fit their abstraction model as defects in the Standard, rather than recognizing that the Standard was never intended to invite such a narrowly-drawn abstraction model in the first place. I wish the response to this Defect Report, and many others, had noted that the Standard's failure to mandate support for a particular construct as a condition of conformance does not imply any judgment as to whether an implementation can be suitable for any particular purpose without support. – supercat Apr 16 '21 at 15:08

While other answers do a reasonable job describing what the Standard would seem to say, both clang and gcc appear to interpret the phrase "subsequent accesses that do not modify the stored value" as though it said "subsequent accesses that do not change the stored bit pattern in a way which will later be observed". Both compilers are prone to take the sequence:

  1. Write storage with a T of value X using reference 1
  2. Write storage with a U of value Y using reference 2
  3. Read storage as type U using reference 3
  4. Optionally write storage with a T of some arbitrary value, using reference 3
  5. Write storage with a T whose bit pattern matches what was read in step #3, using reference 3
  6. Read the storage as type T using reference 1

as exemplified by the code:

typedef long long longish;
__attribute((noinline))
long test(long *p, int index, int index2, int index3)
{
    if (sizeof (long) != sizeof (longish))
        return -1;

    p[index] = 1;                          // Step 1
    ((longish*)p)[index2] = 2;             // Step 2
    longish temp2 = ((longish*)p)[index3]; // Step 3
    p[index3] = 5;                         // Step 4
    p[index3] = temp2;                     // Step 5
    return p[index];                       // Step 6
}
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    long *arr = malloc(sizeof (long));
    long temp = test(arr, 0, 0, 0);
    printf("%ld should equal %ld\n", temp, arr[0]);
    free(arr);
}

and optimize out the write in step #4 (the bit pattern written here will never be observed, since it's overwritten by step #5), as well as the write in step #5 (once the write in step #4 is removed, the write in step #5 will no longer change the bit pattern). Once those writes are removed, the compilers will then assume that since no object of type T has been used to modify the object, they may optimize out the read in step #6. They will do this even if the references should be recognizable as being freshly derived, at each point of use, from a common pointer.

I see nothing in the Standard's terminology that would suggest that such an interpretation is valid or reasonable, but the maintainers of clang and gcc have known for years that they do not handle this corner case and so far as I can tell have made no attempt to accommodate the possibility that step 2 might legitimately overwrite the value written in step 1 if step 3 reads that bit pattern as a U and step 5 writes it as a T.

supercat

For question 1, there's no problem since you access a different object with no declared type. In both the int and double case, then "the type of the lvalue becomes the effective type of the object for that access".

For question 2, it says:

If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value.

Allocated storage has no declared type. You first access it through int, but later you modify it through double. *((double *) ptr) = 1.618; isn't likely some read-modify-write; it's just a write (such concepts aren't even defined by C).

One perfectly sensible interpretation is that "for subsequent accesses that do not modify" does not apply, and we should instead regard it as a new lvalue access with a different effective type. Read quite literally, there wouldn't be any strict aliasing violation.

But it's all ambiguous; you may as well read it as: the compiler should keep track of all effective types internally, and when you access through a non-compatible type, or attempt to modify with a non-compatible type after the object with no declared type previously got an effective type, that's UB.

This part of the standard 6.5/6 and /7 is simply not clear.


Practically, regardless of what the standard says, we can also see that the mainstream compilers do run off into the undefined behavior woods when we try this code with optimizations on:

#include <stdlib.h>
#include <stdio.h>

int main (void)
{
    void *ptr = malloc(4096);        // (A)

    *((int *) ptr) = 10;             // (B)

    /*
     * Does the following line have undefined behavior
     * or violate strict aliasing rules?
     */
    *((double *) ptr) = 1.618;       // (C)

    if (*((int *) ptr) == 10)
        puts("Value didn't change.");
}

https://godbolt.org/z/jhxj7WqKW

  • gcc x86 prints "Value didn't change." with -O3; once we drop -O3, the behavior changes.
  • clang x86 doesn't generate a program, since it thinks the value changed.
  • icc generates mov instructions despite optimizations and checks the contents, then doesn't print anything.

3 different behaviors from 3 compilers, using the same code and the same compiler options... So in practice, we must simply refrain from fishy pointer conversions like this, because 22 years after C99 the compilers are still implementing strict aliasing in broken ways, and I don't blame them, since the standard is so ambiguously written.
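One common way to avoid such pointer conversions, offered here as a sketch rather than something proposed in the discussion above, is to move values in and out of the storage with memcpy, which accesses the bytes as a character array. All helper names below are made up for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helpers: copy a value into or out of raw storage as bytes. */
void put_int(void *buf, int v)
{
    assert(buf != NULL);
    memcpy(buf, &v, sizeof v);
}

int get_int(const void *buf)
{
    int v;
    assert(buf != NULL);
    memcpy(&v, buf, sizeof v);
    return v;
}

void put_double(void *buf, double v)
{
    assert(buf != NULL);
    memcpy(buf, &v, sizeof v);
}

double get_double(const void *buf)
{
    double v;
    assert(buf != NULL);
    memcpy(&v, buf, sizeof v);
    return v;
}

/* Reuses one allocation first for an int, then for a double.
 * Returns 1 if both values read back as written, 0 if not,
 * and -1 if the allocation failed. */
int reuse_buffer(void)
{
    void *buf = malloc(4096);
    if (buf == NULL)
        return -1;

    put_int(buf, 10);
    int i = get_int(buf);

    put_double(buf, 1.618);            /* overwrites the bytes used by the int */
    double d = get_double(buf);

    free(buf);
    return i == 10 && d > 1.6 && d < 1.7;
}
```

The destination of every read is an ordinary declared variable, so no lvalue of type int or double is ever applied to the allocated block itself. Mainstream compilers typically turn such fixed-size memcpy calls into plain loads and stores when optimizing.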

Lundin
  • The sample code is not reflective of OP’s question; it reads memory written as a `double` as an `int`. OP’s question does not involve reading memory with any type different from the type it was written with. – Eric Postpischil Apr 15 '21 at 15:22
  • @Eric Postpischil, right, I'm going to improve my question to make it clearer. – Richard Bryant Apr 15 '21 at 15:38
  • @Lundin: Thanks for answering. I would ask why "for subsequent accesses that do not modify" does not apply to line (C) of Q2? In my knowledge, the term `access` in C11 (3.1) includes reading and "modifying", its meaning closer to "write" than R"M"W operation which modified value based on previous reading. – Richard Bryant Apr 15 '21 at 16:06
  • @Lundin: Right! You pointed out my confusion precisely. I don’t know whether I can overwrite the old effective type of an object having no declared type with a new effective type. If I can, then once the object is overwritten with the new effective type, is it legal as long as I don’t use an lvalue with the old effective type to access the object? – Richard Bryant Apr 15 '21 at 16:19
  • @rici: Unfortunately, Union cannot be used in our case because we cannot enumerate all possible usage types. It can be assumed that the allocated object is a shared memory for IPC. Each round of communicate, the type of the object may be different. For this purpose, we need to overwrite the previous round's effective type with a new one. – Richard Bryant Apr 15 '21 at 16:30
  • @rici: Accessing a union member other than the last one stored is defined behavior in C (not C++). C 2018 6.5.2.3 3 says the value is “that of the named member,” and footnote 99 makes it clear: “If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called "type punning")…” – Eric Postpischil Apr 15 '21 at 16:32
  • @EricPostpischil indeed, although it would be UB if the accessed value happened to be a trap representation. – Ian Abbott Apr 15 '21 at 16:53
  • @RichardBryant: Sure, and I don't think it's a problem at all as long as you always read the type you last wrote. The problem, if there is one, is trying to read an object with "effective type" A through a pointer to type B – rici Apr 15 '21 at 18:29
  • @rici: In cases where a certain bit pattern is written as type U and later as type T, both clang and gcc are prone to behave as though the later write was done with type U, yielding nonsensical behavior if code attempts to read the object using the type T with which it was just written. – supercat Apr 15 '21 at 19:06
  • @supercat: I'm not quite sure I follow that. Do you mean "the same bit pattern" (as in, the compiler can figure out that the second modification is a no-op)? – rici Apr 15 '21 at 19:13
  • @Eric: Ok, good point. I was probably thinking of C++. – rici Apr 15 '21 at 19:16
  • @rici: See the example in my answer. The compiler may not only interpret the second modification as a no-op, but its presence may cause other writes of different bit patterns to also be converted to no-ops. IMHO, gcc and clang are both in dire need of an option to disable any optimizations they can't process reliably in a manner consistent with almost any reasonable interpretation of the Standard, without having to go all the way down to `-O0`-quality code generation. – supercat Apr 15 '21 at 19:22
  • I've never had it happen, but there are reports of this not working correctly without casting it to `(volatile double *)`. – Joshua Apr 15 '21 at 19:29
  • @Joshua: Jumping through hoops to satisfy the "strict aliasing rule" is a fool's errand. Quality compilers will use type-based aliasing to assume that *seemingly unrelated* objects won't alias, rather than as an excuse to ignore obvious relationships among pointers and objects of different types. For example, quality compilers will recognize that given `double *dp`, the assignment `*(unsigned*)dp = 2;` might affect an object of `double` [whoda thunkit?!] Compilers that won't handle that, won't reliably handle code which jumps through hoops to fit the "strict aliasing rule". – supercat Apr 15 '21 at 20:07
  • @EricPostpischil There should be no difference between int -> double and double -> int. In fact the original rationale for the aliasing rules was that compilers shouldn't need to be concerned that an external-linkage `int` might be changed if a function modifies the content pointed at by a `double*` function parameter (see the C99 rationale). My example is the "light version" of that, since all identifiers here are local scope. Still, such code behaves irrationally on the latest versions of all 3 mainstream x86 compilers. – Lundin Apr 16 '21 at 07:05
  • My comment has nothing to do with whether a `double` type is used first and then an `int` type versus an `int` type is used first and then a `double` type. The issue is that the OP asks about doing a **write** of one type followed by a **write** of another type, whereas the sample code introduces undefined behavior with a **read** of a type different from the last type written. OP’s question does not involve any read of a type different from the last type written. – Eric Postpischil Apr 16 '21 at 11:49
  • @EricPostpischil How else would you tell if the double write caused a change of effective type or not then? It doesn't matter until you read the data back. And if you do so using the same type as last time, the compiler will likely even optimize out the whole read/write. – Lundin Apr 16 '21 at 12:59
  • @Lundin: If what OP asks about, writing an `int` and then writing a `double`, is defined and does not violate aliasing rules, then the memory can reliably be subsequently read as a `double`. If what OP asks about, writing an `int` and then writing a `double`, is not defined, then the memory cannot reliably be subsequently read as a `double`. So the question is meaningful without supposing there is any read of a type other than the last one written. It is not asking about the situation in the sample code in this answer, where the memory is read using a type different from the last one written. – Eric Postpischil Apr 16 '21 at 13:52