To make a "clean" (portable) solution, do multiplication of 32-bit halves.
Using mathematical notation:
x = x1 * 2^32 + x0
y = y1 * 2^32 + y0
then
x * y = x1 * y1 * 2^64 + (x1 * y0 + x0 * y1) * 2^32 + x0 * y0
Here, all xn * yn fit into 64 bits. The only tricky part is doing the additions without losing the overflow (carry) bits.
For that, you can use the following code, which checks whether an overflow occurs when adding two 64-bit numbers.
bool overflow(uint64_t x, uint64_t y)
{
return x + y < x;
}
Here is code which calculates the high 64-bit part of 128-bit multiplication:
uint64_t doit(uint64_t x, uint64_t y)
{
// Calculate 64-bit parts of the 128-bit result
uint64_t x1 = x >> 32;
uint64_t x0 = x << 32 >> 32;
uint64_t y1 = y >> 32;
uint64_t y0 = y << 32 >> 32;
uint64_t part0 = x0 * y0;
uint64_t part1 = x1 * y0;
uint64_t part2 = x0 * y1;
uint64_t part3 = x1 * y1;
/// Use part3
uint64_t result = part3;
// Use the 32-bit high halves of part1 and part2
result += part1 >> 32;
result += part2 >> 32;
// Throw away their high half; multiply by 2^32
part1 <<= 32;
part2 <<= 32;
// Calculate the 65-bit sum of parts 1 and 2
bool carry = overflow(part1, part2)
result += carry;
uint64_t temp = part1 + part2;
// Use part0
carry = overflow(temp, part0)
result += carry;
return result;
}