This is related to "Determine cause of segfault when using -O3?". In that question, I'm catching a segfault in a particular function when it is compiled with -O3 using a particular version of GCC. At -O3, vectorized instructions are used (at -O2, they are not).

I want to compile a single function at a lower optimization level. According to "Switching off optimization for a specific function in GCC 4.2.2", it can be done. However, after following the various links in that question and its answers, I can't find an answer to "how, exactly, do I do it?".

How do I mark a single function to use a different optimization level?


Relatedly, I don't want to move this function to a separate file and then provide a different makefile recipe for it. Doing that opens another can of worms, such as applying it to GCC 4.9 only on some platforms.

jww
  • Just backing up that you *really* don't want to do the separate-file option, as if you compile different units with different options you are down the gurgler. – M.M Jul 13 '15 at 01:20
  • Sounds like an [XY-problem](http://meta.stackexchange.com/questions/66377/what-is-the-xy-problem) to me. If your code misbehaves with optimization, it very likely exhibits _undefined behaviour_. That it does not show up without optimization is actually a bad sign. Instead of trying to fiddle with optimization, you definitely should search for the cause. Unless you can prove it is due to a bug in the compiler, of course. In this case the question arises how you guarantee this bug will not show up elsewhere. Here, using a patched or newer compiler is the correct way. – too honest for this site Jul 13 '15 at 02:41
  • @Olaf - the problem shows up at `-O3`, but is not present at `-O2`. At `-O3`, GCC uses SSE instructions. As best I can tell, `vmovdqa` requires 128-bit aligned words, but the code does not guarantee it. The code guarantees the array is 64-bit aligned when this particular code path is used. The code is also clean with Clang and its Undefined Behavior sanitizer. I think I am working around a compiler bug. – jww Jul 13 '15 at 03:33
  • @jww are you talking about [this bug](http://stackoverflow.com/a/30927741/1505939) in mingw-w64? – M.M Jul 13 '15 at 03:37
  • @Matt - it may be the same, but I'm not sure. I don't have the GCC experience/expertise to tell. (I'm used to things just working with GCC. It's rare when I have issues that are not my fault). – jww Jul 13 '15 at 04:24
  • @jww well compare that MCVE to your code... if it is something similar then there are various workarounds. Or post your own MCVE on a new question. – M.M Jul 13 '15 at 05:44
  • So, if that is a compiler bug, then using a more recent compiler (4.2.2 is really pretty old) should change that. I would not bet if there are more worms in the can than in using a more recent version (it does not have to be 4.9 series necessarily). And, why not use that on all platforms? – too honest for this site Jul 13 '15 at 11:54
  • @Olaf - OpenBSD provides GCC 4.2.1. We have to use what is available. But you are right - there is undefined behavior around that code due to allowing unaligned access on x86 and x64. I'm going to propose that the feature be nuked. It's not needed for x86/x64, and it violates C/C++. – jww Jul 13 '15 at 12:16
  • The C standard does not enforce a specific alignment, so if anything, it is a problem with the compiler. Too bad you cannot (or do not want to) switch to a C11-compatible compiler, as the current standard does support custom alignment. – too honest for this site Jul 13 '15 at 12:51

3 Answers

I know this question is tagged as GCC, but I was just looking into doing this portably and thought the results may come in handy for someone, so:

  • GCC has an optimize(X) function attribute
  • Clang has optnone and minsize function attributes (use __has_attribute to test for support). Since 3.5, I believe, it also has #pragma clang optimize on|off.
  • Intel C/C++ compiler has #pragma intel optimization_level 0 which applies to the next function after the pragma
  • MSVC has #pragma optimize, which applies to the first function after the pragma
  • IBM XL has #pragma option_override(funcname, "opt(level,X)"). Note that 13.1.6 (at least) returns true for __has_attribute(optnone) but doesn't actually support it.
  • ARM has #pragma Onum, which can be coupled with #pragma push/pop
  • ODS has #pragma opt X (funcname)
  • Cray has #pragma _CRI [no]opt
  • TI has #pragma FUNCTION_OPTIONS(func,"…") (C) and #pragma FUNCTION_OPTIONS("…") (C++)
  • IAR has #pragma optimize=...
  • Pelles C has #pragma optimize time/size/none

So, for GCC/ICC/MSVC/clang/IAR/Pelles and TI C++, you could define a macro that you just put before the function. If you want to support XL, ODS, and TI C you could add the function name as an argument. ARM would require another macro after the function to pop the setting. For Cray AFAIK you can't restore the previous value, only turn optimization off and on.
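
As a rough illustration of the "macro before the function" idea, here is a minimal sketch covering only GCC and Clang. The macro name FUNC_NO_OPT and the function xor_bytes are hypothetical, and the GCC version check (4.4 or later for the optimize attribute) is an assumption. Compilers that use pragmas are left out, and on any other compiler the macro expands to nothing.

```c
#include <assert.h>
#include <stddef.h>

/* FUNC_NO_OPT is a hypothetical name, not part of any compiler or library. */
#if defined(__clang__)
# if defined(__has_attribute)
#  if __has_attribute(optnone)
#   define FUNC_NO_OPT __attribute__((optnone))
#  endif
# endif
#elif defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 4))
  /* Assumption: the optimize attribute is available from GCC 4.4 on. */
# define FUNC_NO_OPT __attribute__((optimize("O0")))
#endif

#ifndef FUNC_NO_OPT
# define FUNC_NO_OPT /* unknown compiler: no per-function control in this sketch */
#endif

/* The macro goes before the function definition. */
FUNC_NO_OPT
void xor_bytes(unsigned char *buf, const unsigned char *mask, size_t count)
{
    size_t i;

    assert(buf != NULL && mask != NULL);
    for (i = 0; i < count; i++)
        buf[i] ^= mask[i];
}
```

Supporting the compilers that need the function name or a trailing "pop" would mean extra macros, which is why this sketch stops at the attribute-based cases.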

I think the main reason for this is to disable optimizations for a buggy compiler (or a compiler which exposes bugs in your code), so a unified portable experience probably isn't critical, but hopefully this list helps someone find the right solution for their compiler.

Edit: It's also worth noting that it's relatively common to disable optimizations because code which was working before no longer does. While it's possible that there is a bug in the compiler, it's much more likely that your code was relying on undefined behavior, and newer, smarter compilers can and will elide the undefined case. The right answer in situations like this is not to disable optimizations, but instead to fix your code. UBSan on Clang and GCC can help a lot here; compile with -fsanitize=undefined and many instances of undefined behavior will be reported at run time. Also, try compiling with all the warning options you can enable; for GCC that means -Wall -Wextra, and for Clang throw in -Weverything.
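
To make the undefined-behavior point concrete, here is a small sketch of the kind of misaligned access discussed in the question's comments. The function names are hypothetical. The cast-based load is undefined when the pointer is not suitably aligned (and it also breaks the aliasing rules); building with -fsanitize=undefined, which includes the alignment check, should report it at run time. The memcpy-based load is well defined for any alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Undefined behavior if p is not aligned for uint64_t: a vectorizing
   compiler is allowed to assume the alignment and emit aligned loads. */
uint64_t load64_cast(const unsigned char *p)
{
    assert(p != NULL);
    return *(const uint64_t *)p;
}

/* Well defined for any alignment; compilers typically turn this into
   a single load on targets that allow unaligned access. */
uint64_t load64_memcpy(const unsigned char *p)
{
    uint64_t v;

    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}
```

Replacing the cast with the memcpy form fixes the code instead of hiding the problem behind a lower optimization level.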

nemequ

It's described in https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#Common-Function-Attributes

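You can change the level by declaring the function like this:

__attribute__ ((optimize(1))) void some_func() {
    ....
}

This forces optimization level 1 for that function. Note that GCC requires the attribute to come before the declarator in a function definition; it may follow the declarator only in a declaration (prototype).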
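
A minimal sketch of the two placements GCC accepts, assuming a GCC recent enough to know the attribute. The macro OPT_LEVEL_1 and the function sum_bytes are hypothetical names; on Clang and other compilers the macro expands to nothing, since the optimize attribute is a GCC extension.

```c
#include <assert.h>
#include <stddef.h>

/* OPT_LEVEL_1 is a hypothetical helper macro for this sketch. */
#if defined(__GNUC__) && !defined(__clang__)
# define OPT_LEVEL_1 __attribute__((optimize(1)))
#else
# define OPT_LEVEL_1
#endif

/* In a declaration, the attribute may follow the declarator. */
unsigned sum_bytes(const unsigned char *p, size_t n) OPT_LEVEL_1;

/* In a definition, the attribute has to come before the declarator. */
OPT_LEVEL_1 unsigned sum_bytes(const unsigned char *p, size_t n)
{
    unsigned total = 0;
    size_t i;

    assert(p != NULL);
    for (i = 0; i < n; i++)
        total += p[i];
    return total;
}
```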

viraptor
  • Is there a portable solution? Is this portable to Clang? – Ruggero Turra Jul 13 '15 at 00:41
  • No, it's GCC-specific. Clang has a more heavy-handed [optnone](http://clang.llvm.org/docs/AttributeReference.html#optnone-clang-optnone) which kills all optimization. But both are just custom extensions. There's no standard for function-level optimization. – viraptor Jul 13 '15 at 01:00
  • Thanks viraptor. I was hoping for a portable solution since this is a cross platform library. The same set of sources is used on Android, BSDs, iOS, Linux, OS X, Solaris and Windows. – jww Jul 13 '15 at 01:24
  • The attribute must be just before the function signature, not after, in my case. – nouiz Sep 13 '19 at 18:19

Here is how to do it with pragmas:

#pragma GCC push_options
#pragma GCC optimize ("-O2")
void xorbuf(byte *buf, const byte *mask, size_t count)
{
   ...
}
#pragma GCC pop_options

To make it portable, use something like the following:

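/* Evaluate defined() directly in #if: a macro that expands to defined() is undefined behavior. */
#if (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 7)) || defined(__clang__)
# define GCC_OPTIMIZE_AWARE 1
#else
# define GCC_OPTIMIZE_AWARE 0
#endif

#if GCC_OPTIMIZE_AWARE
# pragma GCC push_options
# pragma GCC optimize ("-O2")
#endif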

It needs to be wrapped because, with -Wall, older versions of GCC don't understand -Wno-unknown-pragmas and will produce a noisy compile. Older versions will be encountered in the field, like GCC 4.2.1 on OpenBSD.
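
Putting the pieces together, a guarded version might look like the sketch below. Note that the closing pop_options needs the same guard as the opening pragmas. The byte typedef and the loop body are assumptions made so the example is self-contained; the version test mirrors the one above, with defined() evaluated directly in the #if.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char byte; /* assumption: byte is an unsigned 8-bit type */

#if (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 7)) || defined(__clang__)
# define GCC_OPTIMIZE_AWARE 1
#else
# define GCC_OPTIMIZE_AWARE 0
#endif

#if GCC_OPTIMIZE_AWARE
# pragma GCC push_options
# pragma GCC optimize ("-O2")
#endif

/* Illustrative body only: XOR each byte of buf with the matching byte of mask. */
void xorbuf(byte *buf, const byte *mask, size_t count)
{
    size_t i;

    assert(buf != NULL && mask != NULL);
    for (i = 0; i < count; i++)
        buf[i] ^= mask[i];
}

#if GCC_OPTIMIZE_AWARE
# pragma GCC pop_options
#endif
```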

But according to Markus Trippelsdorf in "When did 'pragma optimize' become available?" on the GCC mailing list:

This is a bad idea in general, because "pragma GCC optimize" is meant as a compiler debugging aid only. It should not be used in production code.

jww