8

As quoted in "Integer division rounding with negatives in C++", in C before C99 (i.e. in C89) and in C++ before C++11 (i.e. in C++98 and C++03), for an integer division computation where either operand is negative the sign of the remainder (or equivalently, the rounding direction of the quotient) is implementation-defined.

Then comes the standard function std::div which is specified to truncate the quotient towards zero (i.e. the remainder has the same sign as the dividend (numerator)) (see for example this answer to "what is purpose of div() library function?").

Here is glibc's code for div() (source) (also quoted in "Is div function useful (stdlib.h)?"):

(Note: div_t is defined as:

typedef struct
  {
    int quot;
    int rem;
  } div_t;

-- end note.)

/* Return the `div_t' representation of NUMER over DENOM.  */
div_t
div (numer, denom)
     int numer, denom;
{
  div_t result;

  result.quot = numer / denom;
  result.rem = numer % denom;

  /* The ANSI standard says that |QUOT| <= |NUMER / DENOM|, where
     NUMER / DENOM is to be computed in infinite precision.  In
     other words, we should always truncate the quotient towards
     zero, never -infinity.  Machine division and remainer may
     work either way when one or both of NUMER or DENOM is
     negative.  If only one is negative and QUOT has been
     truncated towards -infinity, REM will have the same sign as
     DENOM and the opposite sign of NUMER; if both are negative
     and QUOT has been truncated towards -infinity, REM will be
     positive (will have the opposite sign of NUMER).  These are
     considered `wrong'.  If both are NUM and DENOM are positive,
     RESULT will always be positive.  This all boils down to: if
     NUMER >= 0, but REM < 0, we got the wrong answer.  In that
     case, to get the right answer, add 1 to QUOT and subtract
     DENOM from REM.  */

  if (numer >= 0 && result.rem < 0)
    {
      ++result.quot;
      result.rem -= denom;
    }

  return result;
}

As you can see there is a test after the big comment block, whose purpose is to "correct" the result if the built-in division truncates towards -infinity instead of towards zero.

Now the question:

Isn't there a bug in that code?

Let's first consider the example call div(42, -5). "In math" 42/-5 is exactly -8.4, so theoretically in C89 and C++03 42 / -5 could yield either -8 (truncated) or -9 (floored) depending on the implementation. Reading the code:

  • If 42 / -5 yields -8 then 42 % -5 yields 2 (as 42 == -8 * -5 + 2), so the test is (42 >= 0 && 2 < 0) which is not true and the above function returns -8 and 2, as wanted;
  • If 42 / -5 yields -9 then 42 % -5 yields -3 (as 42 == -9 * -5 + -3), so the test is (42 >= 0 && -3 < 0) which is true, so the above function returns the "corrected" -9 + 1 and -3 - -5, i.e. -8 and 2, as wanted.

But now let's consider the call div(-42, 5) (signs inverted):

  • If -42 / 5 yields -8 then -42 % 5 yields -2 (as -42 == -8 * 5 + -2), so the test is (-42 >= 0 && -2 < 0) which is not true and the above function returns -8 and -2, as wanted;
  • If -42 / 5 yields -9 then -42 % 5 yields 3 (as -42 == -9 * 5 + 3), so the test is (-42 >= 0 && 3 < 0) which... is not true! and the above function returns -9 and 3 instead of -8 and -2!

The comment in the code above first seems right when it says that the situation that needs a correction is when "REM has the opposite sign of NUMER", but then it makes the huge simplification "This all boils down to: if NUMER >= 0, but REM < 0, we got the wrong answer", which seems wrong (incomplete) to me because it omits the case "if NUMER < 0, but REM > 0" (-42 and 3 in the previous example).

I can hardly believe that such a bug would have remained unnoticed since 1992 or 1990 (apparently someone tried to "fix" it but it still seems incorrect because div(-42, 5) could return -10 and 8)... Arguably, most implementations have been truncating towards zero by default (and all are required to do so starting from C99 and C++11, so the issue is "moot" in the latest Standards 1) so the bug wouldn't manifest on them, but still... Maybe I'm missing something here?

Thank you for any insights.


1 (Edit) As for "the issue is moot in C++11 and C99 (and newer)": accordingly, in these Standards the built-in division is required to truncate towards zero, so we never need to adjust the result, but doesn't that then mean that the present implementation is more complex than needed and unnecessarily inefficient? The "big comment" is obsolete and the if test useless, so shouldn't that part just be removed entirely?

Community
  • 1
  • 1
gx_
  • 4,690
  • 24
  • 31
  • modulo operator with signed operands is implementation dependent. I think that's the main crux of this issue. – Jiminion Jul 22 '13 at 19:27
  • @Jim `div`'s purpose is precisely to encapsulate the implementation-dependent operations to give implementation-_independent_ results. @ouah Your link is the same code as the last link in my question (http://minilib-c.googlecode.com/svn-history/r2/trunk/stdlib/div.c) and as I said, for `num == -42` and `denom == 5` if you get `r.quot = -9` and `r.rem = 3`, the new "correction" `--r.quot` and `r.rem += 5` will give `r.quot == -10` and `r.rem == 8` (instead of `r.quot == -8` and `r.rem == -2`)... – gx_ Jul 22 '13 at 19:53
  • I think they saw it another way. They defined div precisely w.r.t. %, which they acknowledged was implementation dependent. (Why do modulo operations with mixed sign operators anyway? I don't think this was a heavily trod area, so there was little concern about consistency. – Jiminion Jul 22 '13 at 19:59
  • @ouah Yes, at least it tries to handle it :) the added correction looks more aimed at `-42/-5` which "could" give `9` and `3` instead of `8` and `-2` (Euclidean division) (but it's almost as far-fetched as imagining `42/5` giving `9` and `-3` instead of `8` and `2` (which is forbidden by the Standards anyway)). – gx_ Jul 22 '13 at 20:19

2 Answers2

7

As the original author of the code, I have to say: You're right. It's broken. We had no systems that behaved "the wrong way" for testing, and I probably wrote the above too late (or early...) in the day or something.

We're saved by the newer standards, and the entire thing should be cleaned out, maybe with a small #ifdef (and corrected adjustment code) for pre-C99 if necessary.

(Also I'll note that the original was not indented with GNU-style indentation :-) )

torek
  • 448,244
  • 59
  • 642
  • 775
  • Chris Torek! That's _you_! Thank you so much for your answer! Really I couldn't hope for a better source :) (As for indentation, maybe the original was closer to this [div.c](http://svnweb.freebsd.org/base/head/lib/libc/stdlib/div.c?view=co)?) (Also, in the meantime I have been doing more web-searching, and found [another "fix"](http://www.beedub.com/Sprite093/src/lib/c/stdlib/div.c) (the XOR seems legit but the adjustment wrong), and especially **[a code from 1987](https://raw.github.com/7shi/minix-tools/master/lib/c/ansi/div.c)** (!) which seems to handle both 42/-5 _and_ -42/5 correctly!) – gx_ Jul 28 '13 at 08:25
  • 2
    Yes, that's basically the original (K&R-ish, "high Bosticity" indentation :-) ... what eventually became style(9) in FreeBSD). I never saw the Sprite version, nor the Vrije Universiteit one, before. The xor seems risky with plain (signed) `int`. The `static` method is valid, but in all cases the need for a run-time test just seems messy, somehow. Even if we can't count on C99 it would be nice to have a compiler-specific #define describing integer divide behavior, since so many machines "behave the way C99 wants" in the first place. – torek Jul 28 '13 at 08:42
  • Indeed. **I have filed a [glibc bug report](http://sourceware.org/bugzilla/show_bug.cgi?id=15799).** (By the way, the Vrije Universiteit one was close, but e.g. on a machine that always floors the quotient towards -infinity, `div(-10, 5)`, which yields `{-2, 0}`, will be incorrectly "corrected" to `{-1, -5}`...) – gx_ Jul 29 '13 at 18:58
0

There's rationale, why it can't be implementation-defined and what must it be.

TLDR: remainder sign must match the division sign.

Rationale:
When recovering divided value, knowing divions result, remainder and divisor, it remainder has to be summed with divisor-division product. If remander flag is implementation-specific, than operation to sum it should be also implementation-specific, is not it? Or at least have a code, after division, which would fix remainder sign if it's incorrect.