0

so my question is at Java but it can be in any programming language. there is this declaration :

Random rnd = new Random();

We want to get a random number at range 0 to x

I want to know if there is any mathematical difference between the following:

rnd.nextInt() % x;

and

rnd.nextInt(x)

The main question is, are one of these solutions more random than the other? Is one solution more appropriate or "correct" than the other? If they are equal I will be happy to see the mathematics proof for it

Rogue
  • 11,105
  • 5
  • 45
  • 71
lidorag
  • 69
  • 5
  • 3
    What do you mean by "is it more options"? Using `%` has worse statistical properties. – user2357112 Nov 18 '20 at 18:02
  • My lecturer claims that using the modulo has more options that are not damaged in the same number twice if we want a number from 0 to 100 for example. I claimed it wasn't true and I'm looking for a way to prove it. Or I'm wrong. – lidorag Nov 18 '20 at 18:04
  • I'm sorry for my English, what I'm trying to say is let's say we want to fill an array of 10 with random numbers from 0 to 100. I want to use and.nextInt(100), but my teacher said that using rnd.nextInt() % 100 will give us more random numbers and the chance to get the same number twice reduced, I said it's wrong but I don't know how to explain it, I want to know if I'm right or not and what the math way to proof it – lidorag Nov 18 '20 at 18:07
  • 3
    To help illustrate the problem, let's use a smaller number space, e.g. let us pretend that `nextInt()` is actually returning `byte` value. In that case, `nextInt()` (no arg) returns a number in range `-128` to `127`, so if `x = 100`, `nextInt() % x` will return `-99` to `99`, and the numbers between `-28` and `27` are twice as likely as the rest. In comparison, `nextInt(x)` will return `0` to `99`, and all the numbers are equally likely. *Conclusion:* You teacher is dead wrong, the chance to get the same number twice is *increased* when using `%`, and the numbers aren't even in the same range. – Andreas Nov 18 '20 at 18:33
  • 1
    Why was this closed? The question may not be perfect, but there's enough to go off of to write an answer. I second Andreas' conclusion that the teacher is dead wrong: short of a random number generator with a _uniform probability distribution_, you'll have a very hard time proving it doesn't cause the issue to compound. Not to mention if `N % n > 0`, then any `a` in `N` would more likely be towards the lower end of `a % n` – Rogue Nov 18 '20 at 18:36
  • 2
    ``rnd = new Random(x);`` creates a new random number generator object. ``rnd.nextInt() % x;`` uses this random number generator. I would not use the % (remainder) operator, but rather use rnd.nextInt(x). – NomadMaker Nov 18 '20 at 19:02
  • Fun fact, `% x` is uniformly distributed if its a power of two. The implementation of `nextInt(int bound)` uses this fact. Additionally, it will actually just call `nextInt()` (simplified) in a loop until the outcome is not part of the last integer sequence (which would result in non uniform) and then basically just apply `% x` to it. So the actual implementation is very close to `% x`, just makes sure that its uniformly distributed. – Zabuzard Nov 18 '20 at 20:47
  • `rnd = new Random(x)` is in no way equivalent to `rnd.nextInt() % x`. It is unclear what you're asking. – Mark Rotteveel Nov 19 '20 at 13:16

2 Answers2

4

Welcome to "mathematical insight" with "MS Paint".

So, from a statistical standpoint, it would depend on the distribution of the numbers being generated. First of all, we'll treat the probability of any one number coming up as an independant event (aka discarding the seed, which RNG, etc). Following that, a modulus simply takes a range of numbers (e.g. a from N, where 0<=a<N), and subdivides them based on the divisor (the x in a % x). While the numbers are technically from a discrete population (integers), the range of integers for a probability mass function would be so large that it'd end up looking like a continuous graph anyhow. So let's consider a graph of the probability distribution function for a range of numbers:

Example Distributions

If your random number generator doesn't generate with a uniform distribution across the range of numbers (aka, any number is as likely to come up as another number), then modulo would (potentially) be breaking up the results of a non-uniform distribution. When you consider the individual integers in those ranges as discrete (and individual) outcomes, the probability of any number i (0 <= i < x) being the result is the multiplication of the individual probabilities (i_1 * i_2 * ... * i_(N/x)). To think of it another way, if we overlaid the subdivisions of the ranges, it's plain to see that in non-symmetric distributions, it's much more likely that a modulo would not result in equally likely outcomes:

enter image description here

Remember, the likelihood of an outcome i in the graph above would be achieved through multiplying the likelihood of the individuals numbers (i_1, ..., i_(N/x)) in the range N that could result in i. For further clarity, if your range N doesn't evenly divide by the modular divisor x, there will always be some amount of numbers N % x that will have 1 additional integer that could produce their result. This means that most modulus divisors that aren't a power of 2 (and similarly, ranges that are not a multiple of their divisor) could be skewed towards their lower results, regardless of having a uniform distribution:

enter image description here

So to summarize the point, Random#nextInt(int bound) takes all of these things (and more!) into consideration, and will consistently produce an outcome with uniform probability across the range of bound. Random#nextInt() % bound is only a halfway step that works in some specific scenarios. To your teacher's point, I would argue it's more likely you'll see some specific subset of numbers when using the modulus approach, not less.

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
Rogue
  • 11,105
  • 5
  • 45
  • 71
3

new Random(x) just creates the Random object with the given seed, it does not ifself yield a random value.

I presume you are asking what the difference is between nextInt() % x and nextInt(x).

The difference is as follows.

nextInt(x)

nextInt(x) yields a random number n where 0 ≤ n < x, evenly distributed.

nextInt() % x

nextInt() % x yields a random number in the full integer range1, and then applies modulo x. The full integer range includes negative numbers, so the result could also be a negative number. With other words, the range is −x < n < x.

Furthermore, the distribution is not even in by far the most cases. nextInt() has 232 possibilities, but, for simplicity's sake, let's assume it has 24 = 16 possibilities, and we choose x not to be 16 or greater. Let's assume that x is 10.

All possibilities are 0, 1, 2, …, 14, 15, 16. After applying the modulo 10, the results are 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5. That means that some numbers have a greater likelihood to occur than others. That also means that the change of some numbers occurring twice has increased.

As we see, nextInt() % x has two problems:

  1. Range is not as required.
  2. Uneven distribution.

So you should definetely use nextInt(int bound) here. If the requirement is get only unique numbers, you must exclude the numbers already drawn from the number generator. See also Generating Unique Random Numbers in Java.


1 According to the Javadoc.

MC Emperor
  • 22,334
  • 15
  • 80
  • 130