
I was wondering whether it is possible to have a function do different things depending on whether it is evaluated at compile time.

For example,

struct Vec4 {
    union {
        __m128 simd_data;
        float data[4];
    };

    inline constexpr Vec4 operator+(const Vec4 &v) const {
        // If at compile time:
        return {data[0] + v.data[0], data[1] + v.data[1], data[2] + v.data[2], data[3] + v.data[3]};
        // If at runtime
        return {_mm_add_ps(simd_data, v.simd_data)};
    }
};

int main() {
    constexpr Vec4 v1 = {1, 2, 3, 4};
    constexpr Vec4 v2 = {3, 1, 2, 5};

    // Runtime variable, calls the non-constexpr compiler intrinsic
    Vec4 v_a = v1 + v2;

    // Compile time constant, runs the sequential additions
    constexpr Vec4 v_b = v1 + v2;
}

Being able to do this would let the function be evaluated at compile time when necessary while still using the parallel SIMD instruction at runtime, improving runtime performance. Without it, a function that uses the SIMD compiler intrinsic can only be evaluated at runtime.

I found a thread about using SIMD instructions at compile time; however, its only answer suggests passing an extra boolean argument to tell the function to run at compile time. That is not possible with operator overloading, so it did not help. I also found another thread that suggests a technique similar to this:


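#include <type_traits>  // std::true_type, std::false_type
#include <utility>      // std::forward
#include <xmmintrin.h>  // __m128, _mm_add_ps

template <int>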
using Void = void;

template <typename F, typename A>
constexpr auto
is_a_constant_expression(F &&f, A &&a)
    -> decltype((std::forward<F>(f)(std::forward<A>(a)), std::true_type{})) { return {}; }
constexpr std::false_type is_a_constant_expression(...) { return {}; }

struct Vec4 {
    union {
        __m128 simd_data;
        float data[4];
    };

    inline constexpr Vec4 operator+(const Vec4 &v) const;
};

struct StaticStruct {
    static constexpr Vec4 add_vec4_constexpr(const Vec4 &v1, const Vec4 &v2) {
        return {v1.data[0] + v2.data[0],
                v1.data[1] + v2.data[1],
                v1.data[2] + v2.data[2],
                v1.data[3] + v2.data[3]};
    }
    static inline Vec4 add_vec4_runtime(const Vec4 &v1, const Vec4 &v2) {
        return {_mm_add_ps(v1.simd_data, v2.simd_data)};
    }
};

#define IS_A_CONSTANT_EXPRESSION(EXPR)      \
    is_a_constant_expression(               \
        [](auto ty) -> Void<(decltype(ty):: \
                                 EXPR,      \
                             0)> {},        \
        StaticStruct{})

#define MY_MIN(...) \
    IS_A_CONSTANT_EXPRESSION(StaticStruct ::MyMin_constexpr(__VA_ARGS__)) ? StaticStruct ::MyMin_constexpr(__VA_ARGS__) : StaticStruct ::MyMin_runtime(__VA_ARGS__)

inline constexpr Vec4 Vec4::operator+(const Vec4 &v) const {
    return IS_A_CONSTANT_EXPRESSION(StaticStruct::add_vec4_constexpr(*this, v)) ? StaticStruct::add_vec4_constexpr(*this, v) : StaticStruct::add_vec4_runtime(*this, v);
}

However, I do not understand why this should work (for ordinary functions), nor why it fails in this case. The error I get relates to capturing `this` inside a lambda. I do not entirely understand why a lambda is needed here, and I am wondering whether there is a better way of doing this.

  • In GCC, you can use `__builtin_constant_p(x)` to ask the compiler if `x` has a compile-time constant value. (Useful for using inline asm with runtime variables, pure C for constant-propagation of constants). – Peter Cordes Nov 20 '19 at 14:58
  • If you can use features from C++2a already, there is [std::is_constant_evaluated()](https://en.cppreference.com/w/cpp/types/is_constant_evaluated) – perivesta Nov 20 '19 at 14:59
  • `union { __m128 simd_data; float data[4]; };` would probably lead to UB, as that type punning is disallowed in C++. – Jarod42 Nov 20 '19 at 15:21
  • @PeterCordes I would prefer a compiler agnostic solution. – Gabe Rundlett Nov 20 '19 at 15:39
  • @dave This looks great and I am extremely excited for all the features coming in C++2a, however I cannot use these features yet on all the platforms I am testing my code on. – Gabe Rundlett Nov 20 '19 at 15:39
  • @Jarod42 I am new to SIMD (which is why I came across this issue in the first place) and every example I found used this technique to be able to interface with the 4 individual floats inside the __m128 variable. How would you recommend I do this instead? – Gabe Rundlett Nov 20 '19 at 15:39
  • @Jarod42: I think all compilers that support `__m128` at all define the behaviour of union type punning. GNU C++ does (with documentation to prove it's on purpose), and MSVC even defines `__m128` in terms of a union. But Gabe: normally you'd just store to an array and access elements of that, if you want all 4 elements one at a time. Otherwise shuffle and `_mm_cvtss_f32`. The cvt intrinsic is a no-op because a scalar `float` is just a value at the bottom of an XMM reg. – Peter Cordes Nov 20 '19 at 15:40
  • @PeterCordes, are you saying not to put the __m128 in a union with the array of floats? If so, how does one pass the values to an intrinsic that takes __m128s, __m256s, etc.? Thank you all for the helpful responses. – Gabe Rundlett Nov 20 '19 at 15:53
  • I'm curious about your use case for this… I thought about doing something like this in [SIMDe](https://github.com/nemequ/simde/) a while back, but it would be a lot of work and I didn't think it was likely to have much of an impact in practice so I decided against it. That said, I'd be open to changing my mind if people think it would really help. – nemequ Nov 27 '19 at 19:50
  • @nemequ, my use case is to have the ability to use Vector addition in the constexpr context as well as use simd operations at runtime – Gabe Rundlett Nov 28 '19 at 22:44
  • @GabeRundlett, yes, but *why*? Have you seen code in the wild that would benefit from this? SIMD tends to be used to process data fed to a program; I don't remember ever really running into a situation where it could be computed at compile-time. SO comments probably aren't the best place for this… I've opened up [an issue](https://github.com/nemequ/simde/issues/48) where we could discuss it in a bit more detail. – nemequ Nov 30 '19 at 21:21
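
For reference, the `std::is_constant_evaluated()` suggestion from the comments could look roughly like the sketch below. It is only a minimal illustration under a few assumptions, not a definitive implementation: it drops the union in favour of a plain `float[4]` and goes through `_mm_loadu_ps` / `_mm_storeu_ps` at runtime (in the spirit of the "store to an array" comment), it assumes a compiler that provides either C++20 `std::is_constant_evaluated()` or the `__builtin_is_constant_evaluated()` builtin (recent GCC, Clang and MSVC, as far as I know), and the names `in_constant_evaluation`, `VEC4_USE_SSE` and `add_demo` are made up for this example.

```cpp
#include <cassert>
#include <type_traits>

// Assumption: SSE availability is detected through these predefined macros.
// Without SSE, the runtime path simply reuses the scalar code.
#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>
#define VEC4_USE_SSE 1
#endif

// Hypothetical helper: uses std::is_constant_evaluated() when the library
// provides it (C++20) and otherwise falls back to the compiler builtin.
constexpr bool in_constant_evaluation() noexcept {
#if defined(__cpp_lib_is_constant_evaluated)
    return std::is_constant_evaluated();
#else
    return __builtin_is_constant_evaluated();
#endif
}

struct Vec4 {
    float data[4];  // plain array, no union with __m128

    constexpr Vec4 operator+(const Vec4 &v) const {
#if defined(VEC4_USE_SSE)
        if (!in_constant_evaluation()) {
            // Runtime path: load both operands, add in one SIMD
            // instruction, and store the result back into an array.
            Vec4 r{};
            _mm_storeu_ps(r.data,
                          _mm_add_ps(_mm_loadu_ps(data), _mm_loadu_ps(v.data)));
            return r;
        }
#endif
        // Compile-time path (and non-SSE fallback): element-wise additions.
        return {{data[0] + v.data[0], data[1] + v.data[1],
                 data[2] + v.data[2], data[3] + v.data[3]}};
    }
};

// Hypothetical usage mirroring main() from the question.
inline Vec4 add_demo() {
    constexpr Vec4 v1{{1, 2, 3, 4}};
    constexpr Vec4 v2{{3, 1, 2, 5}};

    // Constant-evaluated, so the scalar branch is taken.
    constexpr Vec4 v_b = v1 + v2;
    static_assert(v_b.data[0] == 4.0f && v_b.data[3] == 9.0f,
                  "sum was computed at compile time");

    // Not a constant-evaluated context, so the SSE branch may be used.
    Vec4 v_a = v1 + v2;
    assert(v_a.data[1] == v_b.data[1]);
    return v_a;
}
```

The check has to be a plain `if`, not `if constexpr`: inside `if constexpr` the condition is itself constant-evaluated, so it would always report `true`. Keeping the intrinsic calls inside the branch that is skipped during constant evaluation is what allows `operator+` to stay `constexpr`.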
