
I want to calculate the quotient of the product of two 64-bit unsigned integers divided by 2^64, i.e. the high 64 bits of the 128-bit product.

Some people say that 128-bit integers can be used, but some programming languages and platforms (such as Visual Studio C++) don't support 128-bit integers as a built-in type.

But I don't want to use division, because division takes too much time. I think it can be done with addition/subtraction, multiplication, and bitwise operations (such as bit shifts).

square1001
  • You may use `double` for multiplying and dividing by 2^64. Since the result will be <= 2^64, you can typecast the result back to `uint64_t` which has 64 bits. – kiner_shah May 26 '18 at 11:22
  • @kiner_shah No, `double` has only 53 bits of mantissa; the other 11 bits are for the exponent. There's a precision issue. – square1001 May 26 '18 at 11:27
  • I tried it, and it looks like it's working: [Link](https://ideone.com/MvE1QO). However, you are right, `double` has 53 bits for the mantissa (IEEE 754). – kiner_shah May 26 '18 at 11:30
  • @kiner_shah Actually it has some error - https://ideone.com/xG75bR – square1001 May 26 '18 at 11:35
  • "divided by 2^64": using the word division is confusing the matter, this is a shift, or actually you want the high part of the multiplication. You don't say if you only want to support visual studio (and only x64?) or if you have other constraints. – Marc Glisse May 26 '18 at 12:03
  • https://stackoverflow.com/q/28868367/1918193 for instance, which contains more links. https://learn.microsoft.com/en-us/cpp/intrinsics/umulh for the Visual Studio x64 version. – Marc Glisse May 26 '18 at 12:05
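
For reference, the links in the last comment describe two ready-made ways to get the high word directly when the platform provides them. The sketch below is an assumption about the toolchain (MSVC targeting x64 with the `__umulh` intrinsic, or GCC/Clang with `unsigned __int128`), and the name `mulhi_native` is just a placeholder:

#include <cstdint>

#if defined(_MSC_VER) && defined(_M_X64)
#include <intrin.h>
// MSVC x64: __umulh returns the high 64 bits of the 128-bit product.
inline uint64_t mulhi_native(uint64_t a, uint64_t b) {
    return __umulh(a, b);
}
#elif defined(__SIZEOF_INT128__)
// GCC/Clang: widen one operand to unsigned __int128 and take the top 64 bits.
inline uint64_t mulhi_native(uint64_t a, uint64_t b) {
    return static_cast<uint64_t>((static_cast<unsigned __int128>(a) * b) >> 64);
}
#endif

The portable answers below do the same thing using 32-bit pieces.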

2 Answers


Break your numbers into two pieces (using bit shifts and bit masks) and apply some algebra.

  • First number: A*2^32 + C, where A and C are each less than 2^32.
  • Second number: B*2^32 + D, where B and D are each less than 2^32.
  • (A*2^32 + C) * (B*2^32 + D) = (A*B)*2^64 + (A*D)*2^32 + (B*C)*2^32 + (C*D)
  • Divide by 2^64: (A*B) + (A*D)/2^32 + (B*C)/2^32 + (C*D)/2^64

So the answer is almost (A*B) + ((A*D)>>32) + ((B*C)>>32), but the truncating shifts can introduce a round-off error. What is the error? Subtract this almost-answer from the exact (real-valued) quotient:

  • ((A*D)&0xFFFFFFFF)/2^32 + ((B*C)&0xFFFFFFFF)/2^32 + (C*D)/2^64 (please view the divisions as "real" or floating point).
  • = [((A*D)&0xFFFFFFFF) + ((B*C)&0xFFFFFFFF) + (C*D)/2^32] / 2^32 (again, real division)
  • = [((A*D)&0xFFFFFFFF) + ((B*C)&0xFFFFFFFF) + ((C*D)>>32)] >> 32, plus something less than 1.

So you can get the desired number with (A*B) + ((A*D)>>32) + ((B*C)>>32) + ([((A*D)&0xFFFFFFFF) + ((B*C)&0xFFFFFFFF) + ((C*D)>>32)] >> 32).
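
A direct C++ translation of that formula might look like the sketch below (the helper name `mulhi64` is mine, not part of the answer). Note the added parentheses: in C++, `+` binds tighter than `>>`, so the shifts must be parenthesized explicitly.

#include <cstdint>

// Sketch: returns floor((x * y) / 2^64), the high 64 bits of the product,
// using only 64-bit arithmetic.
uint64_t mulhi64(uint64_t x, uint64_t y) {
    uint64_t A = x >> 32, C = x & 0xFFFFFFFF;   // x = A*2^32 + C
    uint64_t B = y >> 32, D = y & 0xFFFFFFFF;   // y = B*2^32 + D

    uint64_t AB = A * B;
    uint64_t AD = A * D;
    uint64_t BC = B * C;
    uint64_t CD = C * D;

    // The bracketed correction term: carries out of the low 64 bits.
    uint64_t carry = ((AD & 0xFFFFFFFF) + (BC & 0xFFFFFFFF) + (CD >> 32)) >> 32;

    return AB + (AD >> 32) + (BC >> 32) + carry;
}

No intermediate sum here can overflow: the correction term adds three values each below 2^32, so it stays far below 2^64.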

JaMiT

Decompose your numbers a and b into 32-bit parts:

a = a1 * 2**32 + a0
b = b1 * 2**32 + b0
a * b = (a1 * b1) * 2**64 + (a1 * b0 + a0 * b1) * 2**32 + a0 * b0

To obtain the 64 high bits of the result, the a1 * b1 part is obvious; the rest needs to be decomposed. First add the 32 high bits of a1 * b0 to the 32 high bits of a0 * b1. Then add together the 32 low bits of a1 * b0, the 32 low bits of a0 * b1 and the 32 high bits of a0 * b0, and keep the high bits of this intermediate sum (at most 2 significant bits): this carries the overflow out of the low bits into the result before they are discarded.

Below is the code with a basic check of the result.

#include <cstdint>
#include <iostream>

using namespace std;

inline uint64_t high32(uint64_t x) {
    return x >> 32;
}

inline uint64_t low32(uint64_t x) {
    return static_cast<uint32_t>(x);
}

uint64_t mul64(uint64_t a, uint64_t b)
{
    uint64_t a1 = high32(a);
    uint64_t a0 = low32(a);
    uint64_t b1 = high32(b);
    uint64_t b0 = low32(b);

    uint64_t a1_b0 = a1 * b0;
    uint64_t a0_b1 = a0 * b1;

    // Add the low halves of the two cross products together with the high
    // half of a0 * b0, then keep the high bits of that sum so carries out
    // of the low 64 bits of the full product are not lost.
    return a1 * b1 + high32(a1_b0) + high32(a0_b1)
         + high32(low32(a1_b0) + low32(a0_b1) + high32(a0 * b0));
}

int main()
{
    cout << mul64(UINT64_MAX, UINT64_MAX) << endl;
    cout << UINT64_MAX - 1 << endl;

    cout << endl;
    cout << mul64(UINT64_MAX - 1, UINT64_MAX) << endl;
    cout << UINT64_MAX - 2 << endl;

    cout << endl;
    cout << mul64(UINT64_MAX - 2, UINT64_MAX) << endl;
    cout << UINT64_MAX - 3 << endl;
}
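
On compilers that do provide a 128-bit type (an assumption, e.g. `unsigned __int128` on GCC/Clang), a stronger check could be appended to the program above to compare mul64 against the compiler's own wide multiply:

#ifdef __SIZEOF_INT128__
#include <cassert>

// Cross-check mul64 against the compiler's 128-bit multiplication.
void check(uint64_t a, uint64_t b)
{
    uint64_t expected = static_cast<uint64_t>(
        (static_cast<unsigned __int128>(a) * b) >> 64);
    assert(mul64(a, b) == expected);
}
#endif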
Jérôme Migné