3

I know that modulo of power of 2 can be calculated using bitwise operator

  x % 2^n == x & (2^n - 1).

But I am wondering is there any generalized bitwise algorithm exists to find the modulus of any number is not a power of 2. For example,

 7%5 

Thank you in advance.

aliceangel
  • 304
  • 1
  • 5
  • 19
  • 3
    Possible duplicate of [Is there any way to write "mod 31" without modulus/division operators?](https://stackoverflow.com/questions/26047196/is-there-any-way-to-write-mod-31-without-modulus-division-operators) – Yunnosch Apr 20 '18 at 20:59
  • It was important when uP or uC did not support hardware division. Now the question is a bit obsolete – 0___________ Apr 20 '18 at 21:00
  • 2
    @Yunnosch that algorithm is slower than any div instruction in the modern uP – 0___________ Apr 20 '18 at 21:01
  • 3
    With the years of hardware and compiler optimizations that have gone into optimizing division and in turn mod, why is the provided `'%'` operator not sufficient? Is there some other problem you are trying to solve? – David C. Rankin Apr 20 '18 at 21:02
  • 1
    Interestingly, gcc 7.3.0 on x86_64 seems to compile `int f(int n) { return n % 5; }` in terms of multiplication by `0x66666667` (which is close to 2^32 * (2/5) ), taking the high 32-bits of the 64-bit result, some fix up, then multiplying by 5 (using `lea(%rax, %rax, 4), %eax`) and subtracting. – Daniel Schepler Apr 20 '18 at 21:10
  • @PeterJ_01 What is your point? That the answer is not a useful method outside of the requirements laid down by the question? Does that make the answer wrong? Does it make the question not being a duplicate of this one? – Yunnosch Apr 20 '18 at 21:20
  • @DanielSchepler A variant of this approach, which has been in use for at least 50 years: https://stackoverflow.com/questions/5558492/divide-by-10-using-bit-shifts – Davislor Apr 21 '18 at 00:21
  • @Yunnosch: Re “Well yes, it is the one you show in your code”: `%` is not a bitwise operator. – Eric Postpischil Apr 21 '18 at 02:48
  • Don’t worry about it. Modern compilers don’t divide in cases like yours: https://godbolt.org/g/jDHgJU – Soonts Apr 21 '18 at 08:17
  • there is a [code](http://rextester.com/DQXBT96172) i wrote long ago about a challenge regarding divisibilityby 2/3/5 using bitwise twists, also see [this answer od mine](https://stackoverflow.com/questions/3421609/check-if-a-number-is-divisible-by-3/30313043#30313043) – Abr001am Apr 29 '18 at 18:01
  • [Write a program to find remainder of dividing two numbers, without using % operator? In Java](https://stackoverflow.com/q/21369414/995714), [Getting a remainder without the modulo (%) operator in Javascript, accounting for -/+ sign](https://stackoverflow.com/q/42700673/995714), [Division without using '/'](https://stackoverflow.com/q/5386377/995714)... – phuclv May 02 '18 at 15:08

2 Answers2

6

No, there is no generalized approach for finding division remainders without actually doing division.

Powers of two are an exception because of binary representation, which lets you divide by two using shifts. The same principle is in play as the one that lets you divide decimal numbers by powers of ten simply by dropping digits off the end.

Obviously, nothing stops you from coding up division using bit operations. You would need to code subtraction, too, because the algorithm requires it as a "primitive operation". As you can imagine, this is going to be very slow.

Sergey Kalinichenko
  • 714,442
  • 84
  • 1,110
  • 1,523
  • `r = a % 2` is the same as `r = a & 0x1` –  Apr 20 '18 at 21:03
  • 3
    @GRC And 2 is a power of 2 (2 to the power of 1). So what is your point? – Yunnosch Apr 20 '18 at 21:04
  • My point is that you do not have to divide by 2 to get get reminder, or shift or do anything. –  Apr 20 '18 at 21:05
  • 2
    @GRC So your comment basically means "I totally agree with this answer in every detail." ? – Yunnosch Apr 20 '18 at 21:07
  • 1
    The key word is **generalized**. For a specific value, any bit twiddling alogrithm must be crafted. – Weather Vane Apr 20 '18 at 21:08
  • @Yunnosch yes and no!!! Yes in a way that says you cannot do it, but no in a way that `Powers of two are an exception because of binary representation, which lets you divide by two using shifts.` This is true but it is not really a good explanation. –  Apr 20 '18 at 21:16
  • @GRC I see and I am looking forward for your better explanation in the answer you surely are in the process of writing. – Yunnosch Apr 20 '18 at 21:17
  • LOL, I am not trying to slam his answer down :( For gods sake, just wanto to improve it. I already did edit the answer, but it has to be approved. –  Apr 20 '18 at 21:23
  • 2
    @GRC `r = a % 2` is **not** the same as `r = a & 0x1` when `a` is an `int` or any _signed_ type. – chux - Reinstate Monica Apr 20 '18 at 21:31
  • @chux you are right this only works for values greater than `0`, I could of been more precise, but what my point is no need for division or shift to find reminder. –  Apr 20 '18 at 21:38
  • @chux: you are right of course! I keep forgetting about this unnerving consequence of round to zero division. – chqrlie Apr 29 '18 at 18:31
6

There are a couple, for special cases, including 5.

Since 16 ≡ 1 (mod 5), a trick you could do is split your variable into 4-bit nibbles, look up the modulus of each nibble in a table, and add the values together to get the modulus of the original number.

This program uses bitfields, table lookups, and addition. It would also work for modulo 3 or 15 and could be extended to larger chunks with a bigger lookup table.

#include <assert.h>
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct bitfield64_t {
  uint64_t b0 : 4;
  uint64_t b1 : 4;
  uint64_t b2 : 4;
  uint64_t b3 : 4;
  uint64_t b4 : 4;
  uint64_t b5 : 4;
  uint64_t b6 : 4;
  uint64_t b7 : 4;
  uint64_t b8 : 4;
  uint64_t b9 : 4;
  uint64_t b10 : 4;
  uint64_t b11 : 4;
  uint64_t b12 : 4;
  uint64_t b13 : 4;
  uint64_t b14 : 4;
  uint64_t b15 : 4;
} bitfield64_t;

typedef union pun64_t {
  uint64_t u;
  bitfield64_t b;
} pun64_t;

/* i%5 for i in [0,19].  The upper bound guarantees that nibble_mod5[a+b] is
 * valid whenever a<16 and b<5.
 */
const unsigned nibble_mod5[20] = {
  0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4
};

unsigned add_mod5( const unsigned a, const unsigned b )
/* Returns (a + b) % 5, where
 *   a < 16
 *   b < 5
 */
{
  assert(a < 16);
  assert(b < 5);
  return nibble_mod5[a + b];
}

int main( const int argc, const char* argv[] )
{
  int64_t n;

  if ( argc != 2 ) {
    fprintf( stderr,
             "Call this program with an unsigned number as its argument.\n" );
    return EXIT_FAILURE;
  }

  if ( 1 != sscanf( argv[1], "%lld", &n ) || n < 0 ) {
    fprintf( stderr,
             "The argument must be an unsigned number.\n" );
    return EXIT_FAILURE;
  }

  const pun64_t p = { .u = (uint64_t)n };
  const unsigned result =
    add_mod5( p.b.b15,
    add_mod5( p.b.b14,
    add_mod5( p.b.b13,
    add_mod5( p.b.b12,
    add_mod5( p.b.b11,
    add_mod5( p.b.b10,
    add_mod5( p.b.b9,
    add_mod5( p.b.b8,
    add_mod5( p.b.b7,
    add_mod5( p.b.b6,
    add_mod5( p.b.b5,
    add_mod5( p.b.b4,
    add_mod5( p.b.b3,
    add_mod5( p.b.b2,
    add_mod5( p.b.b1,
    nibble_mod5[p.b.b0] )))))))))))))));

   printf( "%u\n", result );
   assert( result == n % 5 );
   return EXIT_SUCCESS;
}

For finding the modulus of a bignum, you can take advantage of the fact that any power of 16 is congruent to 1 modulo 5. Therefore, whether your word size w is 2⁸, 2ⁱ⁶, 2³² or 2⁶⁴, you can write your bignum as a₀w⁰ + a₁w¹ + a₂w² + ... ≅ a₀1⁰ + a₁1¹ + a₂1² + ... ≡ a₀ + a₁ + a₂ + ... (mod 5). This is also why the sum of the decimal digits of any number is congruent to the original number modulo 3 or 9: 10 ≡ 1 (mod 3).

This also works for 3, 5, 15 and 17 on bytes, factors of 255 and 257 on 16-bit words and factors of 65,535 and 65,537 on 32-bit words. If you notice the pattern, it's because b²ⁿ = (bⁿ+1)(bⁿ-1) + 1, where b = 2 and n = 2, 4, 8 or 16.

You can apply a variant of this method to any n such that your chunk size is congruent to -1 (mod n): alternate addition and subtraction. It works because a₀w⁰ + a₁w¹ + a₂w² + ... ≡ a₀(-1)⁰ + a₁(-1)¹ + a₂(-1)² + ... ≡ a₀ - a₁ + a₂ - ... (mod n), but is less useful because many such values of n are Mersenne primes. It’s similar to how you can take mod 11 of any decimal by going right to left and adding, subtracting, adding and subtracting the digits, e.g. 144 ≅ 4 - 4 + 1 ≡ 1 (mod 11). Just like with digits, you could do the same trick with five-bit chunks, since 32, like 10, is also congruent to -1 modulo 11.

Another useful special case occurs when ww² ≡ c (mod b). Then you have a₀w⁰ + a₁w¹ + a₂w² + ... ≡ a₀·1 + a₁c + a₂c + ... ≡ a₀ + c(a₁ + a₂ + ...) (mod b). This is analogous to how 10 ≡ 100 ≡ 1000 ≡ ... ≡ 4 (mod 6), so any number is congruent to its last digit plus four times the sum of its remaining digits, modulo 6. The computation can be a lookup and an addition per byte, and one multiplication by a small constant that you can do with a bit shift or two. For example, to take mod 20, you can add all but the lowest-order bytes mod 20, multiply the sum by 256 mod 20 = 16, which is just a left shift of 4, then add the final byte. This can be very convenient: not counting numbers that give remainders of 1 or 0, this works with nibbles modulo 6, 10 and 12, and with bytes modulo those values and 20, 24, 30, 34, 40, 48, 60, 68, 80, 96, 102, 120, 136, 160, 170, 192, 204 and 240.

If a number can be expressed as the product of special cases, you can solve for it using the Chinese Remainder Theorem. For example, 77 = 11×7, 32 ≡ -1 mod 11, and 8 ≡ 1 mod 7, so you could find the remainders divided by 11 and 7, which determine the remainder divided by 77. Most small prime numbers fall into one of the special cases previously discussed.

Many later RISC architectures had hardware divide but not modulus, and told programmers to compute a%b by computing a-(a/b)*b. ARM A64 is the one most in use today. If you don’t have hardware division either, check out this answer. An example of another approach when the base is a small constant is given here, and is widely-used on CISC architectures.

There is also an algorithm written by Sean Anderson in 2001 but probably discovered earlier to compute the modulus by a number one less than a power of 2. It’s similar to the technique I used above, but relies on bit shifts and could be extended to any factor of (1<<s)-1. That’s almost what you’re looking for!

Generally, your optimizing compiler should be using the most efficient method to implement % on your hardware already. In your example, any decent compiler will just fold the constants and optimize 7%5 to 2.

Davislor
  • 14,674
  • 2
  • 34
  • 49
  • unless the it will be x%y. Then there is no another way - only division BTW x86 divisions are extremely slow comparing to the ARM. – 0___________ Apr 20 '18 at 22:15
  • @GRC You can award bounties. :) – Davislor Apr 20 '18 at 22:25
  • @Davislor While linked code may be from 2001, the underlying algorithm looks like a straightforward mapping of the old technique of "casting out nines", re-jiggered for base 2**s rather than base 10 (for modulo with divisor of the form base-1). I am quite sure that this mapping into the binary world was re-discovered numerous times prior to the 2000s. There is also the related technique of "casting out elevens" which applies to modulo with a divisor of (base+1). Instead of always adding digits, one alternates between adding and subtracting. – njuffa Apr 20 '18 at 23:13
  • @njuffa Yes, that’s true, and something that I and no doubt many others noticed (although “casting out sevens” appears to be more commonly used for a different technique). You could alternate addition and subtraction to find m % n where your chunk size is congruent to -1 (mod n). – Davislor Apr 20 '18 at 23:19
  • @njuffa However, base+1 is less useful because values for common word sizes are Mersenne primes. – Davislor Apr 20 '18 at 23:25
  • @Davislor I agree that the case (base+1) is of lesser utility than (base-1). At least that has been my experience in 30 years of bit twiddling. – njuffa Apr 20 '18 at 23:38
  • Impressive analysis! This method only works for positive values and you do reject negative inputs, but large unsigned values cannot be entered such as `0xffffffffffffffff`. You could define `n` as an `uint64_t`, check for a `'-'` in the argument and use `%llu`. – chqrlie Apr 29 '18 at 07:10