Strict aliasing makes me paranoid. There are times when I set values through an int * pointer and expect the targeted memory to read back as the same data no matter what the reading pointer type is. Strict aliasing doesn't guarantee this and sometimes even causes it not to be the case.

If I'm reading a char[] in a loop and there's an int * changing something in that char[] array, I'm breaking the aliasing rules, among other standard C things.

I am making a JIT-compiler and since I'm using x86 I'm sure that I don't have to care about int-alignment. Let's keep that out of the equation until we've sorted out the aliasing problem.

Consider this snippet:

unsigned char x86[] = {0x11, 0x44, 0x42, ... };
uint32_t *specific_imm = (uint32_t *)(x86+10);

Now, *specific_imm = 42; on an x86 platform is still UB because the compiler is allowed to assume that *specific_imm isn't aliasing with x86[]. By making that assumption, it doesn't need to set those bytes right away but may do all kinds of optimizations. Setting both x86[] and *specific_imm as volatile would solve my problem but that's not good enough since I want to learn C properly.

We have addressed the aliasing problem now. Some suggest this solution: memcpy(x86+10,specific_imm, 4);

But the C standard seems to have a problem with that too regarding aliasing pointers (if I've understood things correctly) as illustrated by the following code.

/* naive implementation of memcpy */
inline void _memcpy(unsigned char *a, unsigned char *b){
  *a = *b;
}

int main(void) {
  long i = 0xFFFFFFFF;
  unsigned char c = 1;
  ++i;
  _memcpy(&c,&i);
  return c;
}

Since the compiler is free to assume that i doesn't affect c in this case(?), is main free to be optimized to just return 1?

I'm more interested in addressing the problem before jumping straight to solutions.

Thanks in advance

Lundin
  • I suggest you fix your example; you probably saw my comments in the answer: the value of i isn't obvious. – 2501 Jul 07 '16 at 10:26

2 Answers

You are wrong. A C compiler can not assume that an arbitrary pointer and a pointer to a variation of char are not aliased. It also cannot assume that two pointers to signed and unsigned int, or two pointers to signed and unsigned long etc. are not aligned.

In your last example, any sane software developer has his compiler warnings set up in such a way that this doesn't get compiled.
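To illustrate the signed/unsigned case, here is a minimal sketch. The function name write_via_unsigned is made up for this example and does not come from the question. Since unsigned int is the unsigned type corresponding to int, the compiler has to assume that u may point at *s and must reload *s after the second store:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: stores through both pointers, then reads *s back. */
int write_via_unsigned(int *s, unsigned *u)
{
    assert(s != NULL && u != NULL);
    *s = 1;
    *u = 2u;    /* allowed to alias *s: unsigned counterpart of int */
    return *s;  /* must be reloaded; yields 2 when u points at *s */
}
```

Called as write_via_unsigned(&v, (unsigned *)&v) this should return 2; with two separate objects it returns 1.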

gnasher729
  • `are not aligned` should that be `are not aliased`? or am I missing something important here? – Support Ukraine Jul 07 '16 at 11:12
  • You can get UB without warnings with GCC regarding aliased pointers. – jdoeblink33 Jul 07 '16 at 11:41
  • @jdoeblink33: You can also get bogus behavior in cases where gcc ignores aspects of the Standard they don't like (e.g. if code needs to be able to access the common initial sequence of multiple structure types, the Standard specifies that declaring a union type that contains those structures will allow any such type to read members of any other, but since the authors of gcc don't like that rule they just ignore it). – supercat Jul 07 '16 at 15:03

By making that assumption, it doesn't need to set those bytes right away but may do all kinds of optimizations

It doesn't need to set them at all. It can do anything.


Setting both x86[] and *specific_imm as volatile would solve my problem

Not really. Strict aliasing says that a certain variable may not be changed through pointers to unrelated types. Doing so causes your program to do things not specified by the standard. Usually this manifests itself in various optimizer-related bugs, but not necessarily. The program might as well do nothing, or crash and burn.

volatile will not fix this (especially since you declare the pointer as something pointing to volatile data, rather than making the actual data variable volatile).

Some compilers like GCC optimize code with the assumption that your program will never violate strict aliasing (and thereby invoke undefined behavior). But that doesn't mean that shutting optimization off removes the undefined behavior itself. It only stops the optimizer from relying on the assumption that your program does not invoke undefined behavior. It does not fix the actual bug.


Some suggest this solution: memcpy

This will solve the problem, because of the rules of effective type. 6.5/6:

If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one.

This satisfies the first part of the strict aliasing rule, 6.5/7:

An object shall have its stored value accessed only by an lvalue expression that has one of the following types:

— a type compatible with the effective type of the object,
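For the JIT case in the question, a rough sketch of what this might look like follows. patch_imm32 and read_imm32 are hypothetical helper names, not anything from the question, and the byte order written is simply whatever the host uses (little-endian on x86). No uint32_t lvalue ever touches the unsigned char array, so there is nothing for the aliasing rules to object to:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store a 32-bit immediate at a byte offset in a code buffer. */
void patch_imm32(unsigned char *code, size_t len, size_t offset, uint32_t value)
{
    assert(offset <= len && len - offset >= sizeof value);
    memcpy(code + offset, &value, sizeof value);  /* byte-wise copy, host byte order */
}

/* Hypothetical helper: read the 32-bit immediate back out of the buffer. */
uint32_t read_imm32(const unsigned char *code, size_t len, size_t offset)
{
    uint32_t value;
    assert(offset <= len && len - offset >= sizeof value);
    memcpy(&value, code + offset, sizeof value);
    return value;
}
```

With the question's buffer this would be used as patch_imm32(x86, sizeof x86, 10, 42). Compilers commonly turn such a fixed-size memcpy into a single store, so it need not cost anything compared with the pointer cast.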


But the C standard seems to have a problem with that too regarding aliasing pointers (if I've understood things correctly)

No, that is not correct. The real memcpy function uses void pointers and cannot violate strict aliasing for the reasons cited above. Your home-brewed version uses unsigned char*, which is fine too, by 6.5/7:

— a character type.
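A small sketch of what the character-type exception means in practice. zero_through_bytes is a name invented for this example. The stores go through unsigned char lvalues, which may access any object, so the compiler is not allowed to keep the earlier value of *p cached:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: overwrite an int byte by byte, then read it as an int. */
int zero_through_bytes(int *p)
{
    unsigned char *bytes = (unsigned char *)p;
    size_t k;

    assert(p != NULL);
    *p = 1;
    for (k = 0; k < sizeof *p; ++k)
        bytes[k] = 0;   /* character-type access: may alias any object */
    return *p;          /* must be reloaded; all bytes zero reads as 0 */
}
```

This is the same shape as the question's _memcpy example: the compiler cannot assume that a store through unsigned char * leaves other objects untouched.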

Please read What is the strict aliasing rule?, particularly this answer.

Community
Lundin
  • If the authors of the Standard had specified that when the source operand to `memcpy` is a type other than `void*`, the pointer type must be appropriate to the source, and likewise for the destination, that would have accommodated most uses of memcpy for type punning without requiring pessimistic aliasing assumptions. Unfortunately, the actual rules for `memcpy` allow opportunities for compiler mischief without allowing many opportunities for useful optimizations. – supercat Jul 07 '16 at 15:00