I am writing a modulo function and want to optimize the number of instructions called. Currently it looks like this
#include <cstdlib>
constexpr long long mod = 1e9 + 7;
static __attribute__((always_inline)) long long modulo(long long x) noexcept
{
return lldiv(x, mod).rem;
}
and even though I am compiling with -static
there is a call to lldiv that isn't being inlined. I want to get around this by adding inline assembly to my function to call the idiv
instruction directly. How can I do this?
I am using g++ 9.4.0 and I'm targeting x86.
Since this is related to a competetive programming contest, the code will be compiled with g++ -std=c++17 -O3 -static
.