The IDEA cipher uses multiplication modulo 2^16 + 1
. Is there an algorithm to perform this operation without general modulo operator (only modulo 2^16
(truncation))? In the context of IDEA, zero is interpreted as 2^16
(it means zero isn't an argument of our multiplication and it cannot be the result, so we can save one bit and store value 2^16
as bit pattern 0000000000000000
). I am wondering how to implement it efficiently (or whether it is possible at all) without using the standard modulo operator.
Asked
Active
Viewed 1,794 times
6

Michael Foukarakis
- 39,737
- 6
- 87
- 123

ciechowoj
- 914
- 6
- 26
-
I've added some clarification. – ciechowoj Mar 09 '15 at 10:30
2 Answers
5
You can utilize the fact, that (N-1) % N == -1.
Thus, (65536 * a) % 65537 == -a % 65537.
Also, -a % 65537 == -a + 1 (mod 65536), when 0 is interpreted as 65536
uint16_t fastmod65537(uint16_t a, uint16_t b)
{
uint32_t c;
uint16_t hi, lo;
if (a == 0)
return -b + 1;
if (b == 0)
return -a + 1;
c = (uint32_t)a * (uint32_t)b;
hi = c >> 16;
lo = c;
if (lo > hi)
return lo-hi;
return lo-hi+1;
}
The only problem here is if hi == lo
, the result would be 0. Luckily a test suite confirms, that it actually can't be...
int main()
{
uint64_t a, b;
for (a = 1; a <= 65536; a++)
for (b = 1; b <= 65536; b++)
{
uint64_t c = a*b;
uint32_t d = (c % 65537) & 65535;
uint32_t e = m(a & 65535, b & 65535);
if (d != e)
printf("a * b % 65537 != m(%d, %d) real=%d m()=%d\n",
(uint32_t)a, (uint32_t)b, d, e);
}
}
Output: none

Aki Suihkonen
- 19,144
- 1
- 36
- 57
-
ps. Running the suite in core i5 proved the % 65537 method faster. – Aki Suihkonen Mar 09 '15 at 11:42
-
I still do not understand how that 5 last lines work. I suppose it is slower because of that two ifs or compiler is doing some magic under the hood. – ciechowoj Mar 09 '15 at 12:37
-
-
1Combining the first tests to `(a*b == 0)` and wrapping the last expression to `c = lo-hi; c += c >> 16;` one gets on par with the % method on i5. A branchless version will of course outperform the div method when vectorized. – Aki Suihkonen Mar 09 '15 at 13:21
-
Can it be proven somehow from basic principles that `hi != lo`? Is it correct to do `c += c >> 16`? I think that when `lo > hi` then upper bis of `c` are all zeros, and when `lo < hi` then upper bits are ones. So `c += c >> 16` really means `c = c + 2^16 - 1` which when taken modulo `2^16` is `c - 1` and in the first version we were adding one instead of subtracting it (`return lo-hi+1;`). – ciechowoj Mar 10 '15 at 09:17
-
2Yes. hi == lo would require that a * b = 0x10001 * x . 0x10001 is a prime (actually the same prime 65537 that we are taking the modulo) and both _a_ and _b_ are smaller than that. Thus a*b can't form such number. OTOH this is not true for M=2^32+1, which is a composite and prevents augmenting IDEA to 32-bit integers. – Aki Suihkonen Mar 10 '15 at 09:27
-
1The final wrapping must equal to `lo-hi+1`, which is actually equivalent to `c -= c >> 16; // mod 65536`, not `c += xxx`, just as in the other answer. – Aki Suihkonen Mar 10 '15 at 09:33
4
First, the case where either a
or b
is zero. In that case, it is interpreted as having the value 2^16, therefore elementary modulo arithmetic tells us that:
result = -a - b + 1;
, because (in the context of IDEA) the multiplicative inverse of 2^16 is still 2^16, and its lowest 16 bits are all zeroes.
The general case is much easier than it seems, now that we took care of the "0" special case (2^16+1 is 0x10001):
/* This operation can overflow: */
unsigned result = (product & 0xFFFF) - (product >> 16);
/* ..so account for cases of overflow: */
result -= result >> 16;
Putting it together:
/* All types must be sufficiently wide unsigned, e.g. uint32_t: */
unsigned long long product = a * b;
if (product == 0) {
return -a - b + 1;
} else {
result = (product & 0xFFFF) - (product >> 16);
result -= result >> 16;
return result & 0xFFFF;
}

Michael Foukarakis
- 39,737
- 6
- 87
- 123