10

When researching for this question and reading the sourcecode in random.py, I started wondering whether randrange and randint really behave as "advertised". I am very much inclined to believe so, but the way I read it, randrange is essentially implemented as

start + int(random.random()*(stop-start))

(assuming integer values for start and stop), so randrange(1, 10) should return a random number between 1 and 9.

randint(start, stop) is calling randrange(start, stop+1), thereby returning a number between 1 and 10.

My question is now:

If random() were ever to return 1.0, then randint(1,10) would return 11, wouldn't it?

Community
  • 1
  • 1
Tim Pietzcker
  • 328,213
  • 58
  • 503
  • 561
  • By the way, `int(random.random() * n)` still isn't a perfect way to generate integers that are uniformly distributed in `range(n)`; there's a bias that's insignificant for small `n` but becomes significant as `n` becomes large. I've opened a Python bug for this at http://bugs.python.org/issue9025 – Mark Dickinson Jun 22 '10 at 08:45
  • @Mark Dickinson: Thanks! This is fascinating. – Tim Pietzcker Jun 22 '10 at 08:58
  • @Mark Dickinson: [This bug is fixed as of today](http://docs.python.org/dev/whatsnew/3.2.html#random). – Tim Pietzcker Feb 21 '11 at 08:11
  • true, provided that you're using Python >= 3.2. – Mark Dickinson Feb 21 '11 at 08:26

3 Answers3

27

From random.py and the docs:

"""Get the next random number in the range [0.0, 1.0)."""

The ) indicates that the interval is exclusive 1.0. That is, it will never return 1.0.

This is a general convention in mathematics, [ and ] is inclusive, while ( and ) is exclusive, and the two types of parenthesis can be mixed as (a, b] or [a, b). Have a look at wikipedia: Interval (mathematics) for a formal explanation.

aioobe
  • 413,195
  • 112
  • 811
  • 826
  • I hadn't caught that `)` (and even if I had, I wouldn't have known its meaning, so thank you very much for this insightful answer). – Tim Pietzcker Jun 14 '10 at 14:33
  • 1
    @Tim: FYI, there exist several different conventions. Another commonly used convention is inverting the square braces, so that `[a, b[` would be a half-open interval equivalent to `[a, b)`. – Konrad Rudolph Jun 14 '10 at 14:38
  • 5
    This isn't quite enough, since it's not obvious that `0.0 <= x < 1.0` implies that `0 <= x * n < n` in floating-point arithmetic: in general, the result of the multiplication won't be exactly representable, so it'll be rounded. And it could be rounded up. It's *conceivable* that this rounding could produce `n`, for `x` very close to (but not equal to) `1.0`. Fortunately it's possible to show that, assuming round-to-nearest, this can never happen. – Mark Dickinson Jun 14 '10 at 20:58
  • Interesting point. Same applies for other programming languages? – aioobe Jun 15 '10 at 05:56
  • 1
    Sure; this is nothing particularly special to Python. With IEEE 754 floating-point arithmetic in round-to-nearest mode, you can show that `x * y < y` for any `x < 1.0` and any non-tiny positive `y`. (It can fail if `y` is either subnormal or the smallest positive normal number.) – Mark Dickinson Jun 15 '10 at 10:37
12

Other answers have pointed out that the result of random() is always strictly less than 1.0; however, that's only half the story.

If you're computing randrange(n) as int(random() * n), you also need to know that for any Python float x satisfying 0.0 <= x < 1.0, and any positive integer n, it's true that 0.0 <= x * n < n, so that int(x * n) is strictly less than n.

There are two things that could go wrong here: first, when we compute x * n, n is implicitly converted to a float. For large enough n, that conversion might alter the value. But if you look at the Python source, you'll see that it only uses the int(random() * n) method for n smaller than 2**53 (here and below I'm assuming that the platform uses IEEE 754 doubles), which is the range where the conversion of n to a float is guaranteed not to lose information (because n can be represented exactly as a float).

The second thing that could go wrong is that the result of the multiplication x * n (which is now being performed as a product of floats, remember) probably won't be exactly representable, so there will be some rounding involved. If x is close enough to 1.0, it's conceivable that the rounding will round the result up to n itself.

To see that this can't happen, we only need to consider the largest possible value for x, which is (on almost all machines that Python runs on) 1 - 2**-53. So we need to show that (1 - 2**-53) * n < n for our positive integer n, since it'll always be true that random() * n <= (1 - 2**-53) * n.

Proof (Sketch) Let k be the unique integer k such that 2**(k-1) < n <= 2**k. Then the next float down from n is n - 2**(k-53). We need to show that n*(1-2**53) (i.e., the actual, unrounded, value of the product) is closer to n - 2**(k-53) than to n, so that it'll always be rounded down. But a little arithmetic shows that the distance from n*(1-2**-53) to n is 2**-53 * n, while the distance from n*(1-2**-53) to n - 2**(k-53) is (2**k - n) * 2**-53. But 2**k - n < n (because we chose k so that 2**(k-1) < n), so the product is closer to n - 2**(k-53), so it will get rounded down (assuming, that is, that the platform is doing some form of round-to-nearest).

So we're safe. Phew!


Addendum (2015-07-04): The above assumes IEEE 754 binary64 arithmetic, with round-ties-to-even rounding mode. On many machines, that assumption is fairly safe. However, on x86 machines that use the x87 FPU for floating-point (for example, various flavours of 32-bit Linux), there's a possibility of double rounding in the multiplication, and that makes it possible for random() * n to round up to n in the case where random() returns the largest possible value. The smallest such n for which this can happen is n = 2049. See the discussion at http://bugs.python.org/issue24546 for more.

Mark Dickinson
  • 29,088
  • 9
  • 83
  • 120
3

From Python documentation:

Almost all module functions depend on the basic function random(), which generates a random float uniformly in the semi-open range [0.0, 1.0).

Like almost every PRNG of float numbers..

Jack
  • 131,802
  • 30
  • 241
  • 343