5

Recently, I discovered void __builtin_assume(bool) for clang, which can provide additional information about the state of the program to the compiler. This can make a huge difference, like for example:

#include <cstddef>

// compiles to about 80 instructions at -O3
unsigned sum(unsigned data[], size_t count) {
    unsigned sum = 0;
    for (size_t i = 0; i < count; ++i) {
        sum += data[i];
    }
    return sum;
}

// compiles to about 10 instructions at -O3
unsigned sum_small(unsigned data[], size_t count) {
    __builtin_assume(count <= 4);
    unsigned sum = 0;
    for (size_t i = 0; i < count; ++i) {
        sum += data[i];
    }
    return sum;
}

I am forced to use GCC at this time and I am curious whether there exists an equivalent builtin. Unfortunately I could not find __builtin_assume in the GCC documentation. Maybe there exists a builtin but it just has a different name?

If there doesn't exist an equivalent builtin, is there maybe a way to produce the same result without __builtin_assume, such as intentionally invoking undefined behavior when the condition is not true?

Ideally, I would like a macro which is always safe to call like:

#if ... // detect clang
#define MY_ASSUME(condition) __builtin_assume(condition)
#elif ... // detect GCC
#define MY_ASSUME(condition) __gcc_builtin_assume_equivalent(condition)
#else
#define MY_ASSUME(condition)
#endif

Whatever the solution is, it should also work in a constexpr function.

Jan Schultke
  • 17,446
  • 6
  • 47
  • 96
  • 1
    Related: [How to guide GCC optimizations based on assertions without runtime cost?](https://stackoverflow.com/questions/44054078/how-to-guide-gcc-optimizations-based-on-assertions-without-runtime-cost) – Evg Aug 19 '20 at 19:51

3 Answers3

6

I've used __builtin_unreachable(), which indicates that it is undefined behavior for control flow to reach here. You can wrap it in an if to essentially write an assertion. The condition can be any invariant that is false, so in your case you would put the opposite condition.

Example:

// Basically __builtin_assume(count <= 4),
// except that !(count <= 4) is evaluated.
if ( !(count <= 4) ) {
    __builtin_unreachable();
}

You can convert this into an assertion macro like this:

// Line break for readability
#define my_assert(...) \
   { if(!(__VA_ARGS__)) __builtin_unreachable(); }

Based on the code in the question, you would use it like this :

unsigned sum_small(unsigned data[], size_t count) {
    my_assert(count <= 4); // <--- Changed here
    unsigned sum = 0;
    for (size_t i = 0; i < count; ++i) {
        sum += data[i];
    }
    return sum;
}
François Andrieux
  • 28,148
  • 6
  • 56
  • 87
  • this is insufficient to match `__builtin_assume`, because when doing this gcc will often attempt to evaluate the expression, no matter what amount of effort your pour in to tell it to assume the expression has no side effects. this should be fine if you don't perform function calls in the assertion, but may really harm performance if you have involved checks the compiler cannot manage to eliminate!! – asu Aug 09 '21 at 14:08
  • @Asu Can you please elaborate or provide an example? When I try it [here](https://godbolt.org/z/a6dT9joMK) the macro seems to be optimized out as expected and the condition is not evaluated at runtime. – François Andrieux Sep 21 '21 at 13:58
  • @FrançoisAndrieux https://godbolt.org/z/6crecvx8a in this case it works as expected, even if you make `compare` `gnu::noinline`. but add a call to an undefined `bar();` inside of `compare` and it won't get optimized away. (i believe this could be due to exceptions/longjmp that could skip the if branch) as it turns out though clang's implementation of `__builtin_assume` seems worse than i thought: in that usecase it will not use the assumption, and if you annotate `compare` as `gnu::pure` it causes a function call to be emitted... urgh. – asu Sep 22 '21 at 11:05
2

I feel like going trough undefined behavior here is completely unneeded. The very simple if check couple with abort is well-defined and gives optimizer enough food for thought:

#include <cstddef>
#include <cstdlib>

// compiles to about 10 instructions at -O3
unsigned sum_small(unsigned data[], size_t count) {
    if (count > 4)
        std::abort();
    unsigned sum = 0;
    for (size_t i = 0; i < count; ++i) {
        sum += data[i];
    }
    return sum;
}

No need to summon nasal demons when none are required.

SergeyA
  • 61,605
  • 5
  • 78
  • 137
  • 1
    [That doesn't work on clang unfortunately.](https://godbolt.org/z/sh54xc). The compiler just checks the condition and calls `abort`. Otherwise this is also a good solution. I wonder why clang treats `abort` differently though. On second look it does the same on GCC, but it optimizes the rest of the function. – Jan Schultke Aug 19 '20 at 20:41
  • @J.Schultke I believe you asked about a replacement for gcc? I feel it is not hard to incorporate that into macro and use for clang. – SergeyA Aug 19 '20 at 21:03
0

Since C++23, this is possible using the [[assume]] attribute. This works just like clang's __builtin_assume. There is also an __attribute__((__assume__(...)) that works in C and C++.

Definition of Assumption Macro

// define an ASSUME(...) function-style macro so we only need to detect compilers
// in one place

// Comment this out if you don't want assumptions to possibly evaluate.
// This may happen for implementations based on unreachable() functions.
#define DANGEROUS_BEHAVIOR_ASSUMPTIONS_ALLOWED_TO_EVALUATE 1

// preferred option: C++ standard attribute
#ifdef __has_cpp_attribute
  #if __has_cpp_attribute(assume) >= 202207L
    #define ASSUME(...) [[assume(__VA_ARGS__)]]
  #endif
#endif
// first fallback: compiler intrinsics/attributes for assumptions
#ifndef ASSUME
  #if defined(__clang__)
    #define ASSUME(...) do { __builtin_assume(__VA_ARGS__); } while(0)
  #elif defined(_MSC_VER)
    #define ASSUME(...) do { __assume(__VA_ARGS__); } while(0)
  #elif defined(__GNUC__)
    #if __GNUC__ >= 13
      #define ASSUME(...) __attribute__((__assume__(__VA_ARGS__)))
    #endif
  #endif
#endif
// second fallback: possibly evaluating uses of unreachable()
#if !defined(ASSUME) && defined(DANGEROUS_BEHAVIOR_ASSUMPTIONS_ALLOWED_TO_EVALUATE)
  #if defined(__GNUC__)
    #define ASSUME(...) do { if (!bool(__VA_ARGS__)) __builtin_unreachable(); } while(0)
  #elif __cpp_lib_unreachable >= 202202L
    #include <utility>
    #define ASSUME(...) do { if (!bool(__VA_ARGS__)) ::std::unreachable(); ) while(0)
  #endif
#endif
// last fallback: define macro as doing nothing
#ifndef ASSUME
  #define ASSUME(...)
#endif

Usage Example

unsigned sum_small(unsigned data[], size_t count) {
    ASSUME(count <= 4);

    unsigned sum = 0;
    for (size_t i = 0; i < count; ++i) {
        sum += data[i];
    }
    return sum;
}

It may take some time for all compilers to implement [[assume]], but as you can see, there are plenty fallback options. As of the time of writing, only GCC 13 supports this.

See also: C++23 Compiler Support

Jan Schultke
  • 17,446
  • 6
  • 47
  • 96