88

I know Python // rounds towards negative infinity and in C++ / is truncating, rounding towards 0.

And here's what I know so far:

               |remainder|
-12 / 10  = -1,   - 2      // C++
-12 // 10 = -2,   + 8      # Python

12 / -10  = -1,     2      // C++
12 // -10 = -2,   - 8      # Python

12 / 10  = 1,      2       // Both
12 // 10 = 1,      2

-12 / -10 = 1,    - 2      // Both
          = 2,    + 8

C++:
1. m%(-n) == m%n
2. -m%n == -(m%n)
3. (m/n)*n + m%n == m

Python:
1. m%(-n) == -8 == -(-m%n)
2. (m//n)*n + m%n == m

But why Python // choose to round towards negative infinity? I didn't find any resources explain that, but only find and hear people say vaguely: "for mathematics reasons".

For example, in Why is -1/2 evaluated to 0 in C++, but -1 in Python?:

People dealing with these things in the abstract tend to feel that rounding toward negative infinity makes more sense (that means it's compatible with the modulo function as defined in mathematics, rather than % having a somewhat funny meaning).

But I don't see C++ 's / not being compatible with the modulo function. In C++, (m/n)*n + m%n == m also applies.

So what's the (mathematical) reason behind Python choosing rounding towards negative infinity?


See also Guido van Rossum's old blog post on the topic.

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Rick
  • 7,007
  • 2
  • 49
  • 79
  • 16
    Note that the `%` operator is not *quite* the same in C++ and Python: in C++, it is a **remainder** operator but, in Python, it is a **modulus** operator. [Nice answer explaining the difference](https://stackoverflow.com/a/65604804/10871073). – Adrian Mole Jan 16 '22 at 14:16
  • 1
    While the rule `(m/n)*n + m%n == m` applies, the possible outputs of `m%n` is `[-n+1,n-1]` and it is twice as big as it should be `[0,n-1]`. And it is very inconvenient for multiple purposes. Instead they chose the sign invariance... it has its perks too. In general, the problem is that people would want several properties from the rounded division but not all are achievable at the same time - so they pick what they prefer. – ALX23z Jan 16 '22 at 15:24
  • @NotThatGuy mate, forget about my format, `-2` is the quotient and `8` is the remainder here. – Rick Jan 17 '22 at 02:12
  • 9
    [Here's Guido van Rossum's old blog post on the topic.](http://python-history.blogspot.com/2010/08/why-pythons-integer-division-floors.html) – user2357112 Jan 17 '22 at 02:20
  • This is actually the same question as this one: https://stackoverflow.com/q/24848789/13270392 – GACy20 Jan 17 '22 at 09:32
  • Interesting question, but can you format the question in a more readable way with standard notation conventions? `-12 / 10 = -1 - 2` `-12 // 10 = -2 + 8` is confusing – Basj Jan 17 '22 at 14:56
  • Statistics fellow dropping by to point out that "rounding" is not the same as "floor" or "ceiling,", and I get the feeling that most of the operators in question here are truncating? The **R** language *as.integer* truncates regardless of sign; MATLAB's *cast* function rounds away from zero; **R** *round* rounds to nearest **even** value, as expected for a statistics - oriented language. – Carl Witthoft Jan 17 '22 at 15:07
  • 3
    @user438383: I fixed your edit to the title which said rounding *to* negative infinity instead of *toward*. That would be a question about some inputs that produced an actual -Inf as a result from something, but that's not the case here. – Peter Cordes Jan 17 '22 at 16:09
  • See [the division algorithm](https://en.wikipedia.org/wiki/Euclidean_division). "All remainders are nonnegative" is equivalent to "quotients round toward negative infinity." – Eric Towers Jan 17 '22 at 16:10
  • @EricTowers Well, "xx" is equivalent to "xx" doesn't explain anything. I don't want to know what else alogorithm and I don't think I have to. Guido's blog post and Ilmari Karonen's answer, plus some elementary school math are sufficient to understand the question clearly. – Rick Jan 17 '22 at 16:19
  • 1
    False. You ask for clarification regarding 'I didn't find any resources explain that, but only find and hear people say vaguely: "for mathematics reasons".' The clarification is: that is how division is defined to work. If you choose to not understand division, then you choose to not understand the answer to your question. – Eric Towers Jan 17 '22 at 16:21
  • "in C++ / is truncating, rounding towards 0." It's worth noting that while that is true in modern versions of C and C++, in C89 the rounding behavior of division on negative numbers was implementation defined. – plugwash Jan 17 '22 at 18:08
  • @plugwash: Is that worth noting? Hmm, turns out Guido van Rossum began working on Python in the late 1980s, according to wikipedia. I was thinking it was later, but you're right, C at the time allowed that choice. Truncating integer division was at least a de-facto standard by the time C99 standardized it (which is I assume *why* they were able to standardize it without excluding any platforms anyone cared about, at most forcing compiler changes on CPUs that can do it both ways or don't have HW int division.) HW division on then-mainstream ISAs (like x86, MIPS, SPARC) is truncating. – Peter Cordes Jan 18 '22 at 03:47
  • 3
    This is even [explained in the Python FAQ](https://docs.python.org/3/faq/programming.html#why-does-22-10-return-3). – user3840170 Jan 18 '22 at 17:10
  • @AdrianMole *"Note that the % operator is not quite the same in C++ and Python: in C++, it is a remainder operator but, in Python, it is a modulus operator."* I don't think this vocabulary distinction is standard. In fact, if you showed the behaviour of C++ and python's operator % to a mathematician and asked them which one to call modulus and which one to call remainder, they would probably call python's "remainder" and C++'s "modulus" rather than the other way around. – Stef Jan 18 '22 at 22:38
  • 1
    @Stef Well, the C++ *Standard* (sort of) defines the `%` operator as remainder: **8.5.5.4:** *and the binary % operator yields the remainder from the division of the first expression by the second.* But, then again, so does [this Python reference](https://docs.python.org/3/reference/expressions.html#operator-precedence). Nor sure what or where the *official* Python Standard is. – Adrian Mole Jan 18 '22 at 22:49
  • ... sort of confirmed by [this *official* reference](https://docs.python.org/3/reference/expressions.html): *The % (modulo) operator yields the remainder from the division of the first argument by the second.* – Adrian Mole Jan 18 '22 at 23:08
  • Then there's this: *The **floor division** and modulo operators are connected by the following identity: x == (x//y)*y + (x%y).* – Adrian Mole Jan 18 '22 at 23:12
  • Q: "What's the mathematical reason behind Python choosing to round integer division toward negative infinity?" A: It does not. The question is an example of the [Bulverism fallacy](https://rationalwiki.org/wiki/Bulverism). Specifically, Python does not have a concept of a division operator which is overloaded to behave differently with floats compared to integers. The '//' operator is a 'floor division' operator, and its result is consistent whether called with ints or floats (with the output given a compatible type to its inputs). – Tasos Papastylianou Feb 18 '22 at 16:32

8 Answers8

97

But why Python // choose to round towards negative infinity?

I'm not sure if the reason why this choice was originally made is documented anywhere (although, for all I know, it could be explained in great length in some PEP somewhere), but we can certainly come up with various reasons why it makes sense.

One reason is simply that rounding towards negative (or positive!) infinity means that all numbers get rounded the same way, whereas rounding towards zero makes zero special. The mathematical way of saying this is that rounding down towards −∞ is translation invariant, i.e. it satisfies the equation:

round_down(x + k) == round_down(x) + k

for all real numbers x and all integers k. Rounding towards zero does not, since, for example:

round_to_zero(0.5 - 1) != round_to_zero(0.5) - 1

Of course, other arguments exist too, such as the argument you quote based on compatibility with (how we would like) the % operator (to behave) — more on that below.

Indeed, I would say the real question here is why Python's int() function is not defined to round floating point arguments towards negative infinity, so that m // n would equal int(m / n). (I suspect "historical reasons".) Then again, it's not that big of a deal, since Python does at least have math.floor() that does satisfy m // n == math.floor(m / n).


But I don't see C++ 's / not being compatible with the modulo function. In C++, (m/n)*n + m%n == m also applies.

True, but retaining that identity while having / round towards zero requires defining % in an awkward way for negative numbers. In particular, we lose both of the following useful mathematical properties of Python's %:

  1. 0 <= m % n < n for all m and all positive n; and
  2. (m + k * n) % n == m % n for all integers m, n and k.

These properties are useful because one of the main uses of % is "wrapping around" a number m to a limited range of length n.


For example, let's say we're trying to calculate directions: let's say heading is our current compass heading in degrees (counted clockwise from due north, with 0 <= heading < 360) and that we want to calculate our new heading after turning angle degrees (where angle > 0 if we turn clockwise, or angle < 0 if we turn counterclockwise). Using Python's % operator, we can calculate our new heading simply as:

heading = (heading + angle) % 360

and this will simply work in all cases.

However, if we try to to use this formula in C++, with its different rounding rules and correspondingly different % operator, we'll find that the wrap-around doesn't always work as expected! For example, if we start facing northwest (heading = 315) and turn 90° clockwise (angle = 90), we'll indeed end up facing northeast (heading = 45). But if then try to turn back 90° counterclockwise (angle = -90), with C++'s % operator we won't end up back at heading = 315 as expected, but instead at heading = -45!

To get the correct wrap-around behavior using the C++ % operator, we'll instead need to write the formula as something like:

heading = (heading + angle) % 360;
if (heading < 0) heading += 360;

or as:

heading = ((heading + angle) % 360) + 360) % 360;

(The simpler formula heading = (heading + angle + 360) % 360 will only work if we can always guarantee that heading + angle >= -360.)

This is the price you pay for having a non-translation-invariant rounding rule for division, and consequently a non-translation-invariant % operator.

wim
  • 338,267
  • 99
  • 616
  • 750
Ilmari Karonen
  • 49,047
  • 9
  • 93
  • 153
  • 1
    What a vivid example! I read your answer in detail and I agree with you. Quite the same as I was guessing, for `a/b = q, r`, the reason why Python uses this rounding method is to make `a%b` has practical meaning/can match real life examples, when `a < 0` and `b > 0`. See [Here's Guido van Rossum's old blog post on the topic.](http://python-history.blogspot.com/2010/08/why-pythons-integer-division-floors.html) , provided by @user2357112 supports Monica in the question comment. – Rick Jan 17 '22 at 08:25
  • While for `a/b=q, r`, when `a > 0` and `b < 0`, and get `r < 0` , I can't think of any real life meaningful examples matching it. I would take it as just something come along the way when trying to making `a%b= r > 0`, when `a < 0`, `b > 0`. – Rick Jan 17 '22 at 08:25
  • Also I think : "1. `0 <= m % n < abs(n)` for all `m` and `n`;" is not 100% correct. An counterexample is `12 % -10 = -8`. I think it should be "`0 <= m % n < n` for all `m` , and `n > 0`". – Rick Jan 17 '22 at 08:25
  • 1
    Python does at least have math.floor() that does satisfy m // n == math.floor(m / n). It does for small numbers, but for larger numbers it doesn't always because m / n doesn't produce an exact result, it produces a result rounded to a floating point number. – plugwash Jan 19 '22 at 13:16
  • (unoptimized) C code is mostly a one-to-one representation of CPU instructions, so I guess (I don't have any source) it's not C designers who decided how the modulo operator works, but the processor designers. So I guess (again without source) it might be faster or requires less logical gates to implement modulo operation in this way, not necessarily wanting to "define an awkward way" – yume_chan Jan 20 '22 at 09:47
16

But why Python // choose to round towards negative infinity?

According to python-history.blogspot.com Guido van Rossum elected such behavior for // because

(...)there is a good mathematical reason. The integer division operation (//) and its sibling, the modulo operation (%), go together and satisfy a nice mathematical relationship (all variables are integers):

a/b = q with remainder r

such that

b*q + r = a and 0 <= r < b

(assuming a and b are >= 0).

If you want the relationship to extend for negative a (keeping b positive), you have two choices: if you truncate q towards zero, r will become negative, so that the invariant changes to 0 <= abs(r) < otherwise, you can floor q towards negative infinity, and the invariant remains 0 <= r < b(...) In mathematical number theory, mathematicians always prefer the latter choice (...). For Python, I made the same choice because there are some interesting applications of the modulo operation where the sign of a is uninteresting. Consider taking a POSIX timestamp (seconds since the start of 1970) and turning it into the time of day. Since there are 24*3600 = 86400 seconds in a day, this calculation is simply t % 86400. But if we were to express times before 1970 using negative numbers, the "truncate towards zero" rule would give a meaningless result! Using the floor rule it all works out fine. Other applications I've thought of are computations of pixel positions in computer graphics. I'm sure there are more.

So summing it up // behavior choice is due to keeping it consistent with % behavior, latter was selected due to its usefulness in working with negative (before start of 1970) timestamps and pixels.

Daweo
  • 31,313
  • 3
  • 12
  • 25
12

Although I can't provide a formal definition of why/how the rounding modes were chosen as they were, the citation about compatibility with the % operator, which you have included, does make sense when you consider that % is not quite the same thing in C++ and Python.

In C++, it is the remainder operator, whereas, in Python, it is the modulus operator – and, when the two operands have different signs, these aren't necessarily the same thing. There are some fine explanations of the difference between these operators in the answers to: What's the difference between “mod” and “remainder”?

Now, considering this difference, the rounding (truncation) modes for integer division have to be as they are in the two languages, to ensure that the relationship you quoted, (m/n)*n + m%n == m, remains valid.

Here are two short programs that demonstrate this in action (please forgive my somewhat naïve Python code – I'm a beginner in that language):

C++:

#include <iostream>

int main()
{
    int dividend, divisor, quotient, remainder, check;
    std::cout << "Enter Dividend: ";                        // -27
    std::cin >> dividend;
    std::cout << "Enter Divisor: ";                         // 4
    std::cin >> divisor;

    quotient = dividend / divisor;
    std::cout << "Quotient = " << quotient << std::endl;    // -6
    remainder = dividend % divisor;
    std::cout << "Remainder = " << remainder << std::endl;  // -3

    check = quotient * divisor + remainder;
    std::cout << "Check = " << check << std::endl;          // -27
    return 0;
}

Python:

print("Enter Dividend: ")             # -27
dividend = int(input())
print("Enter Divisor: ")              # 4
divisor = int(input())
quotient = dividend // divisor;
print("Quotient = " + str(quotient))  # -7
modulus = dividend % divisor;
print("Modulus = " + str(modulus))    # 1
check = quotient * divisor + modulus; # -27
print("Check = " + str(check))

Note that, for the given inputs of different signs (-27 and 4), both the quotient and remainder/modulus are different between the languages but also that the restored check value is correct in both cases.

Adrian Mole
  • 49,934
  • 160
  • 51
  • 83
  • "when the two operands have different signs, these aren't necessarily the same thing" --> I thought was "when the 1st operand is negative, these aren't necessarily the same thing" [Detail](https://stackoverflow.com/a/20638659/2410359). Maybe try `27 and -4` here. – chux - Reinstate Monica Jan 18 '22 at 21:20
  • @chux With 27 and -4, the C++ code gives 3 for the remainder, whereas the Python code gives -1 for the modulus. See top part of [this answer](https://stackoverflow.com/a/65604804/10871073). – Adrian Mole Jan 18 '22 at 21:46
  • ... of course, when the remainder is zero, then so is the modulus (and *vice versa*) - hence I said "not necessarily..." – Adrian Mole Jan 18 '22 at 21:52
  • 1
    Ah I see, Python mod is not [Euclidean mod](https://en.wikipedia.org/wiki/Euclidean_division#Examples). – chux - Reinstate Monica Jan 18 '22 at 22:11
12

Both whole-number and real-number arithmetic define their division operators so that both of the following equivalences hold for all values of n and d.

(n+d)/d = (n/d)+1
(-n)/d = -(n/d)

Unfortunately, integer arithmetic cannot be defined in such a way that both hold. For many purposes, the first equivalence is more useful than the second, but in most situations where code would be dividing two values, one of the following would apply:

  1. Both values are positive, in which case the second equivalence is irrelevant.

  2. The dividend is a precise integer multiple of the divisor, in which case both equivalences can hold simultaneously.

Historically, the easiest way to handle division involving negative numbers was to observe whether exactly one operand was negative, drop the signs, perform the division, and then make the result negative if exactly one operand was. This would behave as required in both of the common situations, and would be cheaper than using an approach that would uphold the first equivalence in all cases, rather than only when the divisor was an exact multiple of the dividend.

Python shouldn't be viewed as using inferior semantics, but rather deciding that semantics which would generally be superior in cases that mattered would be worth making division slightly slower even in cases where the precise semantics wouldn't matter.

supercat
  • 77,689
  • 9
  • 166
  • 211
8

"for mathematics reasons"

Consider the problem (common enough in video games) where you have an X-coordinate that can be negative, and want to translate it into a tile coordinate (let's suppose 16x16 tiles) and an offset within the tile

Python's % gives you the offset directly, and / gives you the tile directly:

>>> -35 // 16 # If we move 35 pixels left of the origin...
-3
>>> -35 % 16 # then we are 13 pixels from the left edge of a tile in row -3.
13

(And divmod gives you both at once.)

Karl Knechtel
  • 62,466
  • 11
  • 102
  • 153
  • *Translation invariance* is also said by @ilmari. –  Jan 17 '22 at 16:00
  • Yes, that's the underlying mathematical concept; I'm showing another practical example of why you would care about it. (The other poster gave the example of compass headings, which are also relevant for many video games.) – Karl Knechtel Jan 17 '22 at 16:02
  • What is tile coordinate? I want to but can't understand the example. Lack some background knowledge. – Rick Jan 18 '22 at 03:59
  • 2
    @rick easy enough to solve that problem with some googling, but in effect it is: Let's say you have walked some distance from a starting point, and you want to know which street block ("tile coordinate") you're in (assuming street blocks are all equal distances apart from each other) and how far you are from the start of that block ("offset"). Now imagine walking backwards from the starting point instead (negative distance). – Tamoghna Chowdhury Jan 18 '22 at 09:52
  • 2
    Another related example: consider Unix time_t values, which are seconds since 1970-01-01. Suppose you want to convert to a day number, and a number of seconds within the day. And suppose you want this to work for dates prior to 1970-01-01. So, for example, 12:00 noon on 1969-12-29 would be -216000. In python, -216000/86400 would give you day -3 like you want and second-within-the-day 43200 like you want. – Steve Summit Jan 18 '22 at 22:37
6

Python's a // b = floor(a/b) in standard (ASCII) mathematic notation. (In Germany, Gauss' notation [x] is common for floor(x).) The floor function is very popular (often used ⇔ useful; google to see millions of examples). Firstly probably because it's simple & natural : "largest integer ≤ x". As a consequence, it enjoys many nice mathematical properties, like:

  • Translation by an integer k: floor(x + k) = floor(x) + k.
  • Euclidean division: x = y · q + r with 0 ≤ r < q := floor(x/y) for given x and y.

Any definition of the "round towards zero" function I can think of would be much more "artificial" and involve if-then's (possibly hidden in absolute value |.| or similar). I don't know of any math book that introduces a "round towards zero" function. That's already a sufficiently good reason to adopt this convention rather than the other one.

I won't be long on the "compatibility with modulo operation" argument detailed in other answers, but it must be mentioned since it's of course also a valid argument, and it's linked to the above "translation" formula. For example in trig functions, when you need the angle modulo 2 π, it's definitely this division that you will need.

Max
  • 415
  • 5
  • 12
4

Herein, I write div for the integer division operator and mod for the remainder operator.

div and mod must be defined such that, for a, b integers with nonzero b, we have

a == (a div b)*b + (a mod b)

Often you want mod results to always be between 0 (inclusive) and b (exclusive), regardless of the sign of a. For example, a could be an index into a circular buffer, and b could be its (positive) size. Then a mod b could be used as the index into the underlying memory array, even if a is negative. In fact, using a == -1 to refer to the last buffer element is quite popular.

This may be a reason why Python rounds quotients toward negative infinity. Thus, the remainder is either zero or has the sign of the divisor, regardless of the sign of the dividend. Here is a letter-based plot for fixed divisor:

   ^ y = x mod 5
 4 |   *    *    *    *    *
   |  *    *    *    *    *
   | *    *    *    *    *    *
   |*    *    *    *    *    *
 0 +----*----*----*----*----*->
       -5    0    5   10   15 x

In C/C++, things become a little more complicated because integers have limited width.

Suppose a == INT_MIN, which in two's-complement representation is some negative power of two, and b == 3. If we round quotients such that a mod b > 0, then (a div b)*b would have to be less than INT_MIN, which would constitute a signed-integer overflow. The effects would then be implementation-defined. The machine could even disrupt execution, e.g. when compiling with GCC's -ftrapv option. But the rationale for concretizing the behavior of the integer division and remainder operations was to get rid of such uncertainties.

Therefore the only choice left for C/C++ was to round quotients toward zero. Thus, the remainder, if nonzero, has the sign of the dividend.

The drawback is that the plot for fixed divisor looks less regular:

   ^ y = x mod 5
 4 |             *    *    *
   |            *    *    *
   |           *    *    *    *
   |          *    *    *    *
 0 |    *    *    *    *    *
   |   *    *
   |  *    *
   | *    *
-4 |*    *
   +----+----+----+----+----+->
       -5    0    5   10   15 x

Consequently, mod buffer-size does not handle negative index values as we would like. Programming-wise, I dislike this decision, although I can understand the rationale to fulfill a == (a div b)*b + (a mod b) even in extreme cases.

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
ccorn
  • 496
  • 8
  • 10
0

The mathematical reason behind Python choosing to round integer division toward negative infinity is that it is the most mathematically consistent option. In Python, when you divide two integers, the result will always be a floating point number. This number will be rounded to the nearest integer, with positive numbers rounding up and negative numbers rounding down. This consistent rounding behavior is what leads to the rounding toward negative infinity behavior.

The mathematical reason behind Python rounding integer division toward negative infinity is that it gives more consistent results than rounding toward positive infinity. For example, consider the following two expressions:

3 / 4

-3 / 4

The first expression will result in the value 0.75, while the second expression will result in the value -0.75. This is because the first expression rounds toward positive infinity, while the second expression rounds toward negative infinity.

sananthv
  • 3
  • 4
  • Re *"Python, when you divide two integers, the result will always be a floating point number."*: Perhaps qualify that? E.g. *"[In Python 2, division of two ints produces an int. In Python 3, it produces a float.](https://stackoverflow.com/questions/1267869/how-can-i-force-division-to-be-floating-point-division-keeps-rounding-down-to-0/1267892#1267892)"* – Peter Mortensen Jan 29 '22 at 21:29