First of all, this is about gcc on Linux (x86, amd64) and C99.

Here is the code:

#include <stdint.h>

void f(void *p)
{
  uint32_t *num = p;
  *num = 17;
}

int main()
{
  char buf[8] __attribute__ ((aligned (8)));
  f(&buf[3]);
}

The question: is this UB?

On the one hand, Intel CPUs allow unaligned access; on the other hand, I found these: http://www.uclibc.org/docs/psABI-i386.pdf http://www.x86-64.org/documentation/abi.pdf and both of them mention 4-byte alignment for 4-byte integers.

So even if it compiles and works fine, is it still UB? That is, because gcc may assume that a "uint32_t *" pointer holds a 4-byte-aligned address and, for example, use SSE instructions in the function "f" without hesitation?

The gcc maintainers consider such code to be "undefined code": https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66194

The original question is from here (in Russian): https://ru.stackoverflow.com/questions/268888/%D0%9E%D0%BF%D1%8F%D1%82%D1%8C-%D0%BE%D0%B1-%D0%BE%D0%BF%D1%82%D0%B8%D0%BC%D0%B8%D0%B7%D0%B0%D1%86%D0%B8%D0%B8-gcc-o3-intel-i5-2500-segmentation-fault

fghj
  • http://stackoverflow.com/q/9608171/995714 gcc will assume that the pointer is not aligned unless you specify otherwise – phuclv May 18 '15 at 10:28
  • Actually, in the link above gcc assumes that "float*" is 4-byte aligned, not "not aligned"; besides, that topic is about the ARM architecture, and it is not clear whether things are the same for Intel. – fghj May 18 '15 at 10:43
  • You could always disassemble the code and see what instructions it yielded. – Lundin May 18 '15 at 11:10
  • In the real code gcc generates an instruction that causes a segfault (it uses SSE), so the answer to this question decides whether this is a gcc bug or a bug by the programmer who wrote the code. – fghj May 18 '15 at 11:12
  • For code like this, which is explicitly causing misaligned access, I think it is asking too much of the compiler to have it automatically convert the code into something correctly aligned. It is rather a case where the compiler should let go, because the programmer has told it that they know what they are doing. Whether they actually do know what they are doing or not depends on the specific CPU: if the CPU can't handle it then it is a programmer caused bug, as is always the case with undefined behavior. – Lundin May 18 '15 at 11:16
  • I don't think it is too much. If on a specific OS/CPU the compiler works with an unaligned "int32_t*", then it should take care of alignment and, for example, generate code with two branches, one for the aligned case and one for the unaligned case. icc and clang handle this correctly; gcc does not. So I want to know whether this is a gcc bug, or whether the gcc documentation notes somewhere that it assumes an "int32_t*" points to a 4-byte-aligned value. – fghj May 18 '15 at 11:21

2 Answers

Compiling and linking fine means nothing; undefined behavior occurs at run time. As far as the C standard is concerned, the code may or may not invoke UB.

6.3.2.3/7:

A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined.

The term "correctly aligned" depends on CPU architecture. If your CPU has no alignment restrictions, the uint32_t pointer is to be regarded as correctly aligned, and then you would have no UB.
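As a rough way to see what alignment the implementation itself expects for uint32_t, here is a minimal sketch. The helper name is_aligned_for_u32 is hypothetical, __alignof__ is a gcc extension (C11 adds a standard _Alignof operator), and the pointer-to-integer conversion is implementation-defined, though it behaves as expected on the flat address spaces of x86 and amd64.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if p is a multiple of the alignment
   that gcc reports for uint32_t, and 0 otherwise. */
static int is_aligned_for_u32(const void *p)
{
    return (uintptr_t)p % __alignof__(uint32_t) == 0;
}
```

With the 8-byte-aligned buf from the question, &buf[0] and &buf[4] pass this check, while &buf[3] does not on targets where __alignof__(uint32_t) is 4.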

If the CPU allows misaligned access, but it results in less efficient code, then you will have no UB either.

But if the CPU does not support misaligned access, then the code is UB.
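Whichever of these cases applies, the question can be sidestepped by never forming a misaligned uint32_t pointer at all. A minimal sketch, assuming it is acceptable to copy the bytes with memcpy (the helper names store_u32 and load_u32 are made up for illustration):

```c
#include <stdint.h>
#include <string.h>

/* Stores a 32-bit value at any address, aligned or not.
   memcpy works on bytes, so no uint32_t* to p is ever created. */
static void store_u32(void *p, uint32_t value)
{
    memcpy(p, &value, sizeof value);
}

/* Reads a 32-bit value back from any address, aligned or not. */
static uint32_t load_u32(const void *p)
{
    uint32_t value;
    memcpy(&value, p, sizeof value);
    return value;
}
```

With these, the call from the question becomes store_u32(&buf[3], 17). Compilers commonly turn such a fixed-size memcpy into a single move on x86, but that is an optimization, not a guarantee.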

Lundin
  • Your answer is a method for getting the answer, not the answer. As you can see, the question links to the ABIs (though I'm not sure whether they apply to the gcc/linux (x86/amd64) case), and that is why I mentioned the CPU and the compiler+OS (for the ABI): I cannot find an exact reference for gcc's alignment requirements on x86/amd64. – fghj May 18 '15 at 11:18
  • @user1034749 Well, you are actually asking two questions: 1) is the code undefined behavior and 2) which alignment requirement is there for your particular ABI. I answered 1), I don't know enough about 2) to confidently answer. – Lundin May 18 '15 at 11:24
  • I disagree. There is only one question, (1). I reread your answer, and I think it is wrong. You mention the CPU, and yes, x86 and amd64 allow misaligned access. But the standard says nothing about the CPU: what if the CPU allows it but the compiler does not, and the compiler generates code that requires stricter alignment than the CPU does? The C99 standard does not say how "alignment requirements" should be implemented, so that would be a valid implementation. – fghj May 18 '15 at 11:35
  • @user1034749 The standard is only concerned with the "implementation", which is everything from compiler, to ABI, to CPU. It seems strange that an ABI would not allow misaligned access if the CPU supports it though. – Lundin May 18 '15 at 11:43
  • For example, on Intel misaligned access slows the program down; it also requires additional code generation and prevents vectorization (at compile time) of some code, so such an ABI would have many advantages, but that is just speculation. I don't know which ABI gcc uses on linux/x86_64 and linux/i686. – fghj May 18 '15 at 11:50
  • Correction: I know that on linux/x86_64 gcc uses the System V AMD64 ABI (linked in my question), and page 12 notes that a 4-byte int has 4-byte alignment. But whether that means the code above is UB on linux/x86_64/gcc is not clear, and there is still the question of x86. – fghj May 18 '15 at 11:59

Both the x86 and x64 instruction sets allow unaligned access without causing undefined behavior, with the exception of their SIMD extension instruction sets.

The problem with your code on x86/x64 is that it invokes undefined behavior because of an aliasing violation. As you are already using gcc extensions, you can disable the aliasing rules with -fno-strict-aliasing.
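Since gcc extensions are already in play, another option is to say in the type itself that the access may be misaligned and may alias other types. This is only a sketch: the typedef name unaligned_u32 and the function name f_unaligned are made up, and it relies on gcc's documented behavior that the aligned attribute on a typedef can lower alignment and that may_alias exempts the type from type-based alias analysis.

```c
#include <stdint.h>

/* Hypothetical typedef: a uint32_t with alignment lowered to 1 byte,
   allowed to alias objects of any type. Both attributes are gcc extensions. */
typedef uint32_t unaligned_u32 __attribute__ ((aligned (1), may_alias));

/* Same store as f in the question, but through the relaxed type,
   so gcc is not entitled to assume 4-byte alignment here. */
static void f_unaligned(void *p)
{
    unaligned_u32 *num = p;
    *num = 17;
}
```

Called as f_unaligned(&buf[3]), this stores 17 at the odd address. It is not portable C, so code that must build with other compilers is probably better served by memcpy.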

ouah
  • Thanks, but "-fno-strict-aliasing" does not help; looks like a gcc bug? – fghj May 18 '15 at 12:05
  • @user1034749 what do you mean by "does not help", and where do you think there is a gcc bug? – ouah May 18 '15 at 12:06
  • I have a more complex "void f(void *p)" function in the real code, and for it gcc generates SSE/AVX instructions that require 4-byte alignment, and the real code crashes even if I use "-fno-strict-aliasing". – fghj May 18 '15 at 12:08
  • @user1034749 I think you should ask a question with the real test case that shows the crashing behavior. SIMD requirements can be way higher than just 4-byte. – ouah May 18 '15 at 17:34
  • Of course they are higher, but I do NOT use SIMD explicitly, nor do I use intrinsics or any library that uses it. It is just normal C code, but the generated assembly contains SIMD instructions that take no care of alignment, and if the x64 ABI does not require 4-byte alignment, it is gcc's job to handle such a misaligned case. – fghj May 18 '15 at 17:41
  • @user1034749 could you ask a new question with a minimal test case that shows your issue? – ouah May 18 '15 at 17:44
  • The gcc guys think that this is "undefined code" (UC) because of alignment: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66194, so I am not sure who is right here. – fghj May 18 '15 at 18:07
  • @user1034749 Some SSE2 instructions have unaligned and aligned versions (e.g., MOVAPD, MOVAPS, MOVDQA for the aligned ones). The Intel spec also says that using the aligned instructions with unaligned memory (16-byte alignment) is disallowed and a general-protection exception (#GP) is generated. The program you pasted in the gcc PR is written as if alignment was allowed, which is not the case when SIMD instructions are enabled. – ouah May 18 '15 at 19:20
  • It looks like a complete mess to me. There is C code, there is the platform gcc/linux/amd64, and there are restrictions for this platform. How can those restrictions depend on the compiler's optimization level for this platform? They should not change at all. – fghj May 18 '15 at 19:25
  • Obviously, if I write bad code (UB/UC), then whether it crashes may depend on the optimization level, but in my opinion the requirements of the platform (CPU/OS/C compiler) do not change with whether the compiler uses SIMD or not. If the compiler decides to use SIMD and the memory alignment it assumes is not enough for SIMD, then it should handle part of the data with ordinary instructions and the properly aligned part with SIMD, or use SIMD instructions that allow misalignment. – fghj May 18 '15 at 19:30
  • @user1034749 *s/as if alignment/as if overlooking alignment requirements/* your program unfortunately exhibits undefined behavior as per C; as a general rule I personally always avoid casting a pointer to a pointer type other than a character type pointer. – ouah May 18 '15 at 19:35
  • @user1034749 but don't get me wrong, I'm also very surprised that gcc allows itself this kind of optimization, and I too may question its right to do so. I think you should unaccept my answer so more people are encouraged to answer. Also, adding the bugzilla link to your question may help. – ouah May 18 '15 at 19:46
  • Yes, I also do not write this kind of code; I got this question from ru.stackoverflow. But I have seen tons of C code for 32-bit x86 that uses similar "tricks", and I'm very surprised that gcc works this way. – fghj May 18 '15 at 19:48
  • @user1034749 gcc (and other compilers too) correctly performs the same optimizations on ARM processors that do not allow unaligned access (e.g., Cortex M0). The difference here is that gcc seems to have requirements that are greater than the hardware requirements (which your example shows). – ouah May 18 '15 at 19:53
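For loops over data that may be misaligned, which is where the thread above says the SIMD trouble shows up, one hedged way to keep the compiler from assuming 4-byte alignment is to load each element with memcpy. The function below is a made-up illustration (sum_u32 is a hypothetical name) and is not the test case from the bug report.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sums n 32-bit values stored back to back starting at p.
   p may have any alignment: each value is copied out byte-wise,
   so no misaligned uint32_t* is ever dereferenced. */
static uint32_t sum_u32(const void *p, size_t n)
{
    const unsigned char *bytes = p;
    uint32_t total = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t v;
        memcpy(&v, bytes + i * sizeof v, sizeof v);
        total += v;
    }
    return total;
}
```

Because the source no longer promises any alignment, a compiler that chooses to vectorize this loop has to use unaligned loads or handle the misaligned part separately. Whether it vectorizes at all depends on the compiler and its flags.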