1

I'm trying to create a code which divides a uint64_t by another uint64_t plus it applies rounding to the result. The code should be as fast as possible and work for all inputs (e.g. I would prefer it to now have and conditionals).

My current solution looks like this:

static inline uint64_t divide_with_rounding(uint64_t n, uint64_t d)
{
    uint64_t a = n / d;
    uint64_t r = n % d;
    return a + (r >= d - (d / 2));
}

gcc optimizes the division+modulo quite nicely and as well the / 2. But I wonder if there's a shorter and nicer solution.

E.g. something like this:

static inline uint64_t divide_with_rounding(uint64_t n, uint64_t d)
{
    return (n + d / 2) / d;
}

However that one has the disadvantage that divide_with_rounding(UINT64_MAX, 1000) produces 0.

phuclv
  • 37,963
  • 15
  • 156
  • 475
Kevin Meier
  • 2,339
  • 3
  • 25
  • 52
  • Does this answer your question? [Rounding integer division (instead of truncating)](https://stackoverflow.com/questions/2422712/rounding-integer-division-instead-of-truncating) – phuclv Jul 19 '23 at 16:11
  • 1
    round to nearest: `(n + d/2)/d`, to +inf: `(n + d - 1)/d`. duplicates: [What's the right way to implement integer division-rounding-up?](https://stackoverflow.com/q/41870852/995714), [Divide integers with floor, ceil and outwards rounding modes in C++](https://stackoverflow.com/q/63436490/995714), [signed integer division with rounding in C](https://stackoverflow.com/q/60009772/995714), [Fast ceiling of an integer division in C / C++](https://stackoverflow.com/q/2745074/995714) – phuclv Jul 19 '23 at 16:12

1 Answers1

2

The expression is round(x/d) = ⌊(x + d/2)/d⌋ mathematically. From the property of floor function ⌊x + n⌋ = ⌊x⌋ + n we can prove that in case d is even the result is

\left\lfloor \frac{n + \left\lfloor \frac{d}{2}\right\rfloor }{d} \right\rfloor = \left\lfloor \frac{n - \frac{d}{2} + d}{d} \right\rfloor = \left\lfloor \frac{n - \frac{d}{2}}{d} + 1 \right\rfloor = \left\lfloor \frac{n - \frac{d}{2}}{d} \right\rfloor + 1

In case d is odd we can replace d = 2k + 1 and prove that the result is the same. Therefore you can just use

if (n >= d/2)
    return (n - d/2)/d + 1;
else
    return (n + d/2)/d;

This will avoid the situation where n + d/2 overflows

However in case d is not a compile-time constant then it might be faster to do a 128-by-64-bit division if the branch misprediction cost is high. In MSVC you can do like this

uint64_t nH = 0, nL = n, rem = 0;
nL += d/2;
nH += nL < n;                        // { nH, nL } += d/2
return _udiv128(nH, nL, d, &rem);    // { nH, nL } / d

and in compilers with __int128 type like GCC, ICC, Clang... just use it directly

__int128 N = n;
N += d/2;
return N/d;
phuclv
  • 37,963
  • 15
  • 156
  • 475