0
static inline bool is_divisible(uint32_t n, uint64_t M) {
    return n * M <= M - 1;
}

static uint64_t M3 = UINT64_C(0xFFFFFFFFFFFFFFFF) / 3 + 1;
....
uint8_t div3 = is_divisible(17, M3);

As mention in title, this function can determine whether n is divisible by 3. The only thing I figure out is that M is same as ceil((1<<64)/d) which d is 3.

Are there anyone call explain why is_divisible work? thanks!

Steven
  • 811
  • 4
  • 23
  • This appears to be relying on integer overflow. I don't see how this is better than using a simple modulo operator. – paddy Oct 07 '20 at 01:49
  • 1
    @paddy divide operation is expensive for processor. multiply is cheaper. also `M3` can be derermine before run time. – Steven Oct 07 '20 at 01:54
  • It depends on the processor. Are you targeting a specific one? Is this "divisible" operation a known bottleneck for your program? Performing a 64-bit multiplication may also have drawbacks. Especially on processors that do not have native 64-bit support. Also, operating on 64-bit values halves the possible number of concurrent operations, which affects potential performance gains that could be made through vectorized instructions if available on your platform. – paddy Oct 07 '20 at 02:03
  • 2
    @Steven Yes, but if you change the function to `return n % 3 == 0;`, the optimizer will convert the division to a multiplication anyways, so there's no need for this code. – user3386109 Oct 07 '20 at 02:04
  • 1
    Hint: work out what `2*M3` and `3*M3` are. Then write your number as `n=3*x+y` where `y` is either `0`, `1` or `2`. Using properties of arithmetic, determine what `n*M3` could be, and when it is less than `M3-1`. – Nate Eldredge Oct 07 '20 at 02:07
  • 1
    Indeed using these kinds of optimizations usually [*hinders* the *good* compilers from generating effective code](https://stackoverflow.com/questions/10455800/malloc-in-c-same-sizeof-before-and-after). Just use `M3 % 17` and let the compiler figure it ou – Antti Haapala -- Слава Україні Oct 07 '20 at 05:05
  • @AnttiHaapala Or do you want `n%3` in this case? "determine whether n is divisible by 3" – chux - Reinstate Monica Oct 07 '20 at 05:23
  • @chux-ReinstateMonica I am not clang or Godbolt, I couldn't figure out what that code does :P – Antti Haapala -- Слава Україні Oct 07 '20 at 05:24
  • the snippet above can be found in [Faster remainders when the divisor is a constant: beating compilers and libdivide](https://lemire.me/blog/2019/02/08/faster-remainders-when-the-divisor-is-a-constant-beating-compilers-and-libdivide/). Explanation: [Efficient divisibility testing](https://web.archive.org/web/20180817163253/http://clomont.com/efficient-divisibility-testing/) – phuclv Oct 07 '20 at 07:37
  • Some examples: [Divisiblity of 5 without using % and / operator](https://stackoverflow.com/a/17121753/995714), [Fast divisibility tests (by 2,3,4,5,.., 16)?](https://stackoverflow.com/a/49264279/995714), [How to check if given number is divisible of 15 in fastest way?](https://stackoverflow.com/a/18706614/995714) – phuclv Oct 07 '20 at 07:40

1 Answers1

1

Divide n by 3 to find quotient q and remainder r, allowing us represent n as n = 3q + r, where 0 ≤ r < 3.

Intuitively, multiple 3q + r by (264−1)/3 + 1 causes the q portion to vanish because it wraps modulo 264, leaving the remainder portion to be in one of three segments of [0, 264) depending on the value of r. The comparison with M3 determines whether it is in the first segment, meaning r is zero. A proof follows.

Note that M3 is (264−1)/3 + 1.

Then nM3 = (3q + r)•((264−1)/3 + 1) = q•(264−1)+3q+r•((264−1)/3 + 1) = q•264+2q+r•((264−1)/3 + 1).

When this is evaluated with uint64_t arithmetic, the q•264 term vanishes due to wrapping modulo 264, so the computed result is 2q+r•((264−1)/3 + 1).

Suppose n is a multiple of 3. Then r = 0. Since n is a uint32_t value, q < 232, so 2q+r•((264−1)/3 + 1) = 2q < 233 < M3.

Suppose n is not a multiple of 3. Then r = 1 or r = 2. If r = 1, r•((264−1)/3 + 1) = (264−1)/3 + 1 > M3−1. And, of course, if r = 2, r•((264−1)/3 + 1) is even greater and also exceeds M3−1. However, we need to be concerned about wrapping in uint64_t arithmetic. Again, since q < 232, we have, for r = 2, 2q+r•((264−1)/3 + 1) < 2•232 + 2•((264−1)/3 + 1) = 233 + 2/3•264 − 2/3 + 2 = 264 − 1/3•264 + 233 + 4/3, which is clearly less than 264, so no wrapping occurs.

Eric Postpischil
  • 195,579
  • 13
  • 168
  • 312