1

I need to generate random numbers between 100 and 500 such that their mean is 150. I've been reading up on distribution curves and probability density functions but can't find a concrete solution to this problem. Any help is appreciated.

  • Welcome to stackoverflow.com. Please take some time to read [the help pages](http://stackoverflow.com/help), especially the sections named ["What topics can I ask about here?"](http://stackoverflow.com/help/on-topic) and ["What types of questions should I avoid asking?"](http://stackoverflow.com/help/dont-ask). Also please take the [tour] and read about [ask] good questions. Lastly please read [this question checklist](https://codeblog.jonskeet.uk/2012/11/24/stack-overflow-question-checklist/). – Some programmer dude Sep 16 '22 at 06:36
  • And if you want a pure mathematical solution, then please see [the Math SE site](https://math.stackexchange.com/help/on-topic). – Some programmer dude Sep 16 '22 at 06:47
  • Do you know which distribution you want? A uniform distribution (used by, for instance, `random.nextInt(...)`) always has its mean in the middle of the range, so that won't work. – Alexander Torstling Sep 16 '22 at 07:13
  • Does this answer your question? [Is there an efficient way to generate N random integers in a range that have a given sum or average?](https://stackoverflow.com/questions/61393463/is-there-an-efficient-way-to-generate-n-random-integers-in-a-range-that-have-a-g) – Peter O. Sep 16 '22 at 09:53
  • @PeterO. Requiring integers instead of floating point numbers adds quite a severe additional constraint not mentioned here. – MvG Sep 16 '22 at 14:19

2 Answers

2

I can think of two possible approaches you might take. One would be to take something like a normal distribution, but that might lead to values outside your range. You could discard those samples and try again, but doing so would shift the mean away from the one you chose for the untruncated distribution. So this is probably the more complicated approach.
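To illustrate how discarding samples shifts the mean, here is a minimal sketch of that rejection approach. The class name, the method names and the standard deviation of 50 are assumptions made up for this illustration; they are not part of the question.

```java
import java.util.Random;

// Sketch only: hypothetical names, and sigma = 50 is an arbitrary assumption.
final class TruncatedNormalDemo {
    // Draws from a normal distribution and retries until the value lies in [min, max].
    static double next(Random random, double mu, double sigma, double min, double max) {
        while (true) {
            double y = mu + sigma * random.nextGaussian();
            if (y >= min && y <= max) {
                return y;
            }
        }
    }

    // Averages n accepted draws so the resulting mean can be inspected.
    static double sampleMean(Random random, int n) {
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += next(random, 150.0, 50.0, 100.0, 500.0);
        }
        return sum / n;
    }

    public static void main(String[] args) {
        // Far more samples are rejected below 100 than above 500,
        // so the printed mean ends up noticeably above 150.
        System.out.println(sampleMean(new Random(1), 1_000_000));
    }
}
```

Compensating for that shift means choosing the parameters of the untruncated distribution so that the truncated mean comes out at 150, which is where the extra complexity comes from.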

The other alternative, the one I would actually suggest, is to start with a uniform distribution in the range 0 to 1. Then transform that number so the result has the properties you want.

There are many such transformations you could employ. In the absence of any strong rationale for something else, I would probably go for some formula such as

y = pow(x, a) * b + c

In that formula x would be uniformly distributed in the [0, 1] range, while y should have the bounds and mean you want, once the three parameters have been tuned correctly.

Using b=400 and c=100 you can match the endpoints quite easily, because a number from the [0, 1] range raised to any positive power will again be a number from that range. So now all you need is to determine a. Reversing the effect of b and c, you want pow(x, a) to have a mean of (150 - c) / b = 1/8 = 0.125.

To compute the mean (or expected value) of a discrete distribution, you multiply each value by its probability and sum them up. In the case of a continuous distribution, that becomes the integral of the value times the probability density function. The density function of our uniform distribution is 1 in the interval and 0 elsewhere. So we just need to integrate pow(x, a) from 0 to 1. The result of that is 1 / (a + 1), so you get

1 / (a + 1) = 1 / 8
     a + 1  =     8
     a      =     7
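For readers who want the integration step spelled out, it can be written as follows. This assumes a > -1 so that the integral converges, which holds for the value found here.

```latex
\mathbb{E}\left[x^{a}\right]
  = \int_{0}^{1} x^{a} \cdot 1 \, dx
  = \left[ \frac{x^{a+1}}{a+1} \right]_{0}^{1}
  = \frac{1}{a+1},
\qquad
\frac{1}{a+1} = \frac{1}{8} \;\Rightarrow\; a = 7
```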

So taking it all together I'd suggest

return Math.pow(random.nextDouble(), 7) * 400 + 100;
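A self-contained sketch of that one-liner, with an empirical check of the mean, might look like this. The class and method names and the seed are hypothetical choices for the example; only the formula comes from the derivation.

```java
import java.util.Random;

// Sketch only: the class and method names are hypothetical.
final class PowerSampler {
    private static final double A = 7.0;    // exponent from 1 / (a + 1) = 1 / 8
    private static final double B = 400.0;  // width of the range, 500 - 100
    private static final double C = 100.0;  // lower bound of the range

    // Maps a uniform draw from [0, 1) to a value in [100, 500).
    static double next(Random random) {
        return Math.pow(random.nextDouble(), A) * B + C;
    }

    // Averages n draws so the mean can be checked empirically.
    static double sampleMean(Random random, int n) {
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += next(random);
        }
        return sum / n;
    }

    public static void main(String[] args) {
        Random random = new Random(42); // fixed seed, chosen arbitrarily
        // Should print a value close to 150.
        System.out.println(sampleMean(random, 1_000_000));
    }
}
```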

If you want to get a feeling for this distribution, you can consider

x = pow((y - c) / b, 1 / a)

to be the cumulative distribution function. The density would be the derivative of that with respect to y. Ask a computer algebra system and you get some ugly formula. You might as well ask directly for a plot, e.g. on Wolfram Alpha.

That probability density is actually infinite at 100, and then drops quickly for larger values. So one thing you don't get from this approach is a density maximum at 150. If you had wanted that, you'd need a different approach, but getting both the density maximum and the expected value at 150 feels really tricky.
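For reference, differentiating the cumulative distribution function by hand gives a fairly compact expression. This is a sketch assuming b > 0, with a=7, b=400 and c=100 plugged in on the second line; the negative exponent is what makes the density blow up as y approaches 100.

```latex
f(y) = \frac{d}{dy}\left(\frac{y-c}{b}\right)^{1/a}
     = \frac{1}{a\,b}\left(\frac{y-c}{b}\right)^{\frac{1}{a}-1}
\\
f(y) = \frac{1}{2800}\left(\frac{y-100}{400}\right)^{-6/7},
\qquad 100 < y \le 500
```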

One more thing to consider would be reversing the orientation. If you start with b=-400 and c=500 you get a=1/7. That's a different distribution, with different properties but the same bounds and mean. If I find the time I'll try to plot a comparison for both of these.
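A minimal sketch of that reversed variant, under the same assumptions as before (hypothetical class and method names, arbitrary seed), could be:

```java
import java.util.Random;

// Sketch only: the class and method names are hypothetical.
final class ReversedPowerSampler {
    private static final double A = 1.0 / 7.0; // exponent for the reversed orientation
    private static final double B = -400.0;    // negative width flips the direction
    private static final double C = 500.0;     // upper bound of the range

    // Maps a uniform draw from [0, 1) to a value in (100, 500].
    static double next(Random random) {
        return Math.pow(random.nextDouble(), A) * B + C;
    }

    // Averages n draws so the mean can be checked empirically.
    static double sampleMean(Random random, int n) {
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += next(random);
        }
        return sum / n;
    }

    public static void main(String[] args) {
        // Should print a value close to 150, like the first variant.
        System.out.println(sampleMean(new Random(42), 1_000_000));
    }
}
```

Here the mean of pow(x, 1/7) is 1 / (1/7 + 1) = 7/8, so the expected value is 500 - 400 * 7/8 = 150.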

MvG
  • Thanks for this extremely detailed answer! I do not know a lot of math but I can still follow your writeup. It's okay if the density max is not at 150, but values should be mostly around 150. – Bad Scientist Sep 16 '22 at 07:53
0

Create a List of numbers that have the mean value you want. They can be in any order, since only the mean matters at this point (so you could just count up or whatever).

Then use Collections.shuffle to randomize the order of the list.
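As a rough sketch of this idea: the list below counts up from 100 to 200, which is just one arbitrary choice of values inside the 100 to 500 range whose mean is 150. The class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Sketch only: hypothetical names, and 100..200 is an arbitrary choice of values.
final class ShuffledValues {
    // Builds the integers 100..200 (mean exactly 150) and shuffles them.
    static List<Integer> build(Random random) {
        List<Integer> values = new ArrayList<>();
        for (int v = 100; v <= 200; v++) {
            values.add(v);
        }
        Collections.shuffle(values, random);
        return values;
    }

    public static void main(String[] args) {
        List<Integer> values = build(new Random());
        double mean = values.stream().mapToInt(Integer::intValue).average().orElse(Double.NaN);
        System.out.println(mean); // the mean of the list is 150 regardless of order
    }
}
```

Note that this fixes the mean of the whole list exactly, but each value appears only once, so it behaves like drawing without replacement rather than like independent random draws.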

cyberbrain