Modulo multiplication (in C)

Question

Specifically: I have two unsigned integers (a,b) and I want to calculate (a*b)%UINT_MAX (UINT_MAX is defined as the maximal unsigned int). What is the best way to do so?

Background: I need to write a module for linux that will emulate a geometric sequence, reading from it will give me the next element (modulo UINT_MAX), the only solution I found is to add the current element to itself times, while adding is done using the following logic:(that I use for the arithmetic sequence)

for(int i=0; i<b; ++i){
  if(UINT_MAX - current_value > difference) {
    current_value += difference;
  } else {
    current_value = difference - (UINT_MAX - current_value);
  }

when current_value = a in the first iteration (and is updated in every iteration, and difference = a (always). Obviously this is not a intelligent solution. How would an intelligent person achieve this?

Thanks!

are you not allowed to use the modulus operator or 8 byte integer types? — davogotland, Jan 14 '12 at 14:33
The very simple stupid solution for where "long long" is a longer type than int. long long result = ((long long)a) * ((long long)b) % ((long long)UINT_MAX); — Joachim Isaksson, Jan 14 '12 at 14:43
@JoachimIsaksson result shouldn't have to be of type long long then, right? — davogotland, Jan 14 '12 at 14:55
You probably can use the Chinese Remainder Theorem. See these questions [Restore a number from several its remainders](http://stackoverflow.com/questions/5286898/restore-a-number-from-several-its-remainders-chinese-remainder-theorem) and [Sum and multiplication modulo](http://stackoverflow.com/questions/6008312/sum-and-multiplication-modulo) — ypercubeᵀᴹ, Jan 14 '12 at 14:59
I'm looking for a solution that will also work for the "largest" datatype in the system, so using "long long" is not it. @ypercube CRT is hard to use in this case since UINT_MAX=(2^n)-1, isn't it? — Gilgr, Jan 14 '12 at 15:37
Doing modulo 2^64-1 arithmetic calculations in 64 bits might be easier than modulo 2^32-1 in 32 bits: `2^64-1 = 3 x 5 x 17 x 257 x 641 x 65537 x 6700417` — ypercubeᵀᴹ, Jan 14 '12 at 15:55
Highest prime factor has 23 bits, less than half of 64. Highest factor of `2^32-1` has 17 bits, just one more than half of 32, and this complicates things. — ypercubeᵀᴹ, Jan 14 '12 at 15:59
See also this: [Efficient VLSI Implementation of Modulo 2^n+-1 Addition and Multiplication](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.71.1710&rep=rep1&type=pdf) — ypercubeᵀᴹ, Jan 14 '12 at 16:09
If `a=2^n` then `XY mod(a-1) = ((XY mod a) + (XY div a)) mod (a-1)` — ypercubeᵀᴹ, Jan 14 '12 at 16:12

Daniel Fischer · Accepted Answer · 2012-01-14T15:40:20.740

2

As has been mentioned, if you have a type of twice the width available, just use that, here

(unsigned int)(((unsigned long long)a * b) % UINT_MAX)

if int is 32 bits and long long 64 (or more). If you have no larger type, you can split the factors at half the bit-width, multiply and reduce the parts, finally assemble it. Illustrated for 32-bit unsigned here:

a_low = a & 0xFFFF;  // low 16 bits of a
a_high = a >> 16;    // high 16 bits of a, shifted in low half
b_low = b & 0xFFFF;
b_high = b >> 16;
/*
 * Now a = (a_high * 65536 + a_low), b = (b_high * 65536 + b_low)
 * Thus a*b = (a_high * b_high) * 65536 * 65536
 *          + (a_high * b_low + a_low * b_high) * 65536
 *          + a_low * b_low
 *
 * All products a_i * b_j are at most (65536 - 1) * (65536 - 1) = UINT_MAX - 2 * 65536 + 2
 * The high product reduces to
 * (a_high * b_high) * (UINT_MAX + 1) = (a_high * b_high)
 * The middle products are a bit trickier, but splitting again solves:
 * m1 = a_high * b_low;
 * m1_low = m1 & 0xFFFF;
 * m1_high = m1 >> 16;
 * Then m1 * 65536 = m1_high * (UINT_MAX + 1) + m1_low * 65536 = m1_high + m1_low * 65536
 * Similar for a_low * b_high
 * Finally, add the parts and take care of overflow
 */
m1 = a_high * b_low;
m2 = a_low * b_high;
m1_low = m1 & 0xFFFF;
m1_high = m1 >> 16;
m2_low = m2 & 0xFFFF;
m2_high = m2 >> 16;
result = a_high * b_high;
temp = result + ((m1_low << 16) | m1_high);
if (temp < result)    // overflow
{
    result = temp+1;
}
else
{
    result = temp;
}
if (result == UINT_MAX)
{
    result = 0;
}
// I'm too lazy to type out the rest, you get the gist, I suppose.

Of course, if what you need is actually reduction modulo UINT_MAX + 1, as @Toad assumes,then that's just what multiplication of unsigned int does.

edited Jan 14 '12 at 15:40

answered Jan 14 '12 at 15:16

Daniel Fischer

181,706
17
308
431

2

Reduction of unsigned long long mod MAX_INT can be simplified by applying the equation (a*N+b)%(N-1) = (a + b)%(N-1). – Raymond Chen Jan 14 '12 at 15:27
True, but if `a+b` overflows, you have to adjust. And if a larger type is available, up- and down-casting solves the problem without splitting the product in high and low bits, so that's conceptually simpler. – Daniel Fischer Jan 14 '12 at 15:48
Agreed that it is conceptually simpler, but the trick avoids an expensive double-word division – Raymond Chen Jan 14 '12 at 18:33
Indeed. Although I'm not sure the division would still be slower than shift, add and overflow check on today's processors. But since you know much more about such things than I do, I'll take your word for it. – Daniel Fischer Jan 14 '12 at 18:59
No inside knowledge here. But most 32-bit processors do not have a native "64 divided divided by 32 yielding 64" instruction. (They may have 64 div 32 to 32.) The 64 div 32 to 64 case is usually implemented in software. – Raymond Chen Jan 14 '12 at 19:44
But hopefully by engineers with better knowledge of algorithms and the specific hardware than Joe Random User. Since it's not a too exotic instruction, I would expect it to be optimised rather well. So I wouldn't bet offhand that one could beat it by exploiting the special structure of the problem. But nor would I bet against it. – Daniel Fischer Jan 14 '12 at 20:28
This solution will not work if the compiler is not C99 compliant, or if it (partially) is but "long long" happens to have the width as "unsigned int". – SquareRootOfTwentyThree Sep 16 '12 at 08:43
@SquareRootOfTwentyThree For the short and quick way with the cast, I specifically said "if". The long and tedious way works - with minor modifications - for all widths of `unsigned int` (if the width is odd, the modification would have to be a bit less minor). It uses only properties of unsigned integers guaranteed since C89. So can you elaborate in what way "This solution will not work..."? – Daniel Fischer Sep 16 '12 at 13:28

Toad · Answer 2 · 2012-01-14T15:45:11.317

EDIT: As pointed out in the comments... this answer applies to Modulo MAX_INT+1 I'll leave it standing here, for future reference.

It's much simpeler than that:

Just multiply the two unsigned ints, the result will also be an unsigned int. Everything which didn't fit in the unsigned int is basically not there. So no need to do a modulo operation:

See example here

 #include <stdio.h>

 void main()
 {
     unsigned int a,b;
     a = 0x90000000;
     b = 2;

     unsigned int c = a*b;

     printf("Answer is %X\r\n", c);
 }

answer is: 0x20000000 (so it should have been 0x120000000, but the answer has been truncated, just what you wanted with the modulo operation)

Looks like you misread the question along with the first answer. (which is now deleted) The OP wants modulo `UINT_MAX`, not `UINT_MAX + 1`. — Mysticial, Jan 14 '12 at 15:36

Modulo multiplication (in C)

2 Answers2