0

If I generate n random numbers in the interval [0,1], their mean will be around 0.5 and they will be uniformly distributed. What would an algorithm or formula look like if I want n random numbers still in the interval [0,1] but with a mean of, e.g., 0.6? They should still be distributed as uniformly as possible, with numbers greater than 0.5 occurring a bit more frequently.

So far I have only found solutions that assume a different distribution. With a normal distribution, for example, it would be quite easy to get numbers around the desired mean, but then numbers much larger or much smaller than the mean would be much less frequent, and I'd like to avoid that.

The programming language does not really matter, although I am currently trying to do this in R.

Volokh
  • 3
    The description of the distribution is a bit fuzzy. You can't have a mean of 0.6 and uniform in the range [0, 1]. Do you want the PDF to be flat from 0 to 0.5, then step up and be flat again from 0.5 to 1? Do you want it to gradually increase from 0 to 1? Is it OK if it curves a bit? Maybe try sketching the PDF you want and including an image of it in your question. – Richie Cotton Dec 09 '20 at 21:30
  • One way to formalize this would be to think about criteria in terms of the CDF. (1) Must be a non-decreasing function on (0,1). (2) CDF(0)=0, CDF(1)=1 (range criterion). (3) integral of (1 - CDF(x)) dx over (0,1) = m (mean criterion; equivalently, integral of x*PDF(x) dx = m). (4) minimize the integrated second derivative (this one is trickier, but it's one way to operationalize "as uniform as possible"). However, it rules out a piecewise-linear function. – Ben Bolker Dec 09 '20 at 22:39
  • 1
    One could also characterize "uniformity" as the variance of the PDF (I don't know what if any relationship that would have to the integrated second derivative of the CDF). This might make a good question for [CrossValidated](https://stats.stackexchange.com) ... – Ben Bolker Dec 09 '20 at 23:12
  • Does this answer your question? [Is there an efficient way to generate N random integers in a range that have a given sum or average?](https://stackoverflow.com/questions/61393463/is-there-an-efficient-way-to-generate-n-random-integers-in-a-range-that-have-a-g) – Peter O. Dec 10 '20 at 08:43
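
The first shape raised in the comments above (a PDF that is flat on [0, 0.5] and flat at a higher level on [0.5, 1]) can be sketched as a two-component mixture. This is only an illustration under stated assumptions: the function name rstep_unif and its arguments are hypothetical and do not come from the thread. With probability p a value is drawn from the upper half, so the mean is 0.25 + 0.5 * p, which gives p = 2 * target - 0.5. That only works for targets between 0.25 and 0.75.

```r
# Hypothetical helper: step-shaped density on [0, 1] with a chosen mean
rstep_unif <- function(n, target) {
  # The two-level step density can only reach means in [0.25, 0.75]
  stopifnot(target >= 0.25, target <= 0.75)
  # Probability of landing in the upper half, from mean = 0.25 + 0.5 * p
  p <- 2 * target - 0.5
  upper <- runif(n) < p
  # Uniform within whichever half was selected
  ifelse(upper, runif(n, 0.5, 1), runif(n, 0, 0.5))
}

set.seed(42)
x <- rstep_unif(1e5, target = 0.6)
mean(x)        # sample mean, should be close to the target
mean(x > 0.5)  # share of values in the upper half, should be close to p
```

For a target of 0.6 this gives p = 0.7, so the density is 0.6 on the lower half and 1.4 on the upper half.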

2 Answers

4

This is more of a statistics question: you don't want a uniform distribution, but rather a different distribution that is close to the uniform. Your description alone matches several distributions. For example, you could use a density function with a smooth slope between 0 and 1, or one with a "bump" around 0.6.
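
To illustrate the "smooth slope" option, here is a minimal sketch, assuming the density is a straight line f(x) = 1 + b * (x - 0.5) on [0, 1]. Its mean is 0.5 + b / 12, so b = 12 * (target - 0.5), and the density stays non-negative only when the target lies between 1/3 and 2/3. The function name rlinear_unif is hypothetical; sampling uses the inverse of the CDF F(x) = (b / 2) * x^2 + (1 - b / 2) * x.

```r
# Hypothetical helper: linearly sloped density on [0, 1] with a chosen mean
rlinear_unif <- function(n, target) {
  b <- 12 * (target - 0.5)   # slope of f(x) = 1 + b * (x - 0.5)
  stopifnot(abs(b) <= 2)     # otherwise f(x) would be negative somewhere
  u <- runif(n)
  if (b == 0) return(u)      # zero slope is the plain uniform case
  c1 <- 1 - b / 2
  # Solve (b / 2) * x^2 + c1 * x - u = 0 for the root that lies in [0, 1]
  (-c1 + sqrt(c1^2 + 2 * b * u)) / b
}

set.seed(42)
x <- rlinear_unif(1e5, target = 0.6)
mean(x)  # sample mean, should be close to the target
```

For a target of 0.6 the density rises from 0.4 at x = 0 to 1.6 at x = 1.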

You should check out the beta distribution, which has properties similar to what you want. It has two shape parameters that can make the distribution more or less bumpy. And you can reparametrize it* to take the desired mean as input.


x <- 0:200/100 - .5
plot(x, dunif(x), type="l", main = "Uniform")

plot(x, dbeta(x,1.1,1), type = "l", main = "Beta 1.1; 1")

plot(x, dbeta(x,1.3,1.1), type = "l", main = "Beta 1.3; 1.1")

Created on 2020-12-09 by the reprex package (v0.3.0)

  • Reparametrization: as per the linked Wikipedia article, we have these relationships:

    α = μν, β = (1 − μ)ν

where μ is the mean and ν is a sample-size parameter. So, if you want a given μ = 0.6, you just need to choose a value of ν, and that gives you the shape1 and shape2 parameters to feed into rbeta().
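
As a minimal sketch of that reparametrization (the value of ν below is an arbitrary choice for illustration, not a recommendation from the linked article):

```r
mu <- 0.6   # desired mean
nu <- 2.5   # "sample size" parameter; arbitrary illustrative value

shape1 <- mu * nu         # alpha = mu * nu
shape2 <- (1 - mu) * nu   # beta  = (1 - mu) * nu

set.seed(1)
x <- rbeta(1e5, shape1, shape2)
mean(x)   # sample mean, should be close to mu
range(x)  # all values stay inside [0, 1]
```

Smaller values of ν give a flatter density, but with μ = 0.6 any ν below 2.5 makes shape2 drop below 1, and the density then grows without bound near 1. Larger values of ν concentrate the sample around the mean.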

Alexlok
1

You could do this by taking your sample first, then finding the exponent that gives the sample the desired mean when every value is raised to that power. You can find this exponent using optimize and wrap it all in a handy function:

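runif_skew <- function(n, mean) {
  # Start from a plain uniform sample on (0, 1)
  y <- runif(n)
  # Search for the exponent a that minimises the squared difference between
  # the sample mean of y^a and the target (the argument `mean`); R still finds
  # the function mean() in the call because it skips non-function objects
  o <- optimize(function(x) sapply(x, function(a) (mean(y^a) - mean)^2), 
                c(-10, 10))
  # Apply the fitted exponent to the whole sample
  return(y^o$minimum)
}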

So testing, we get:

set.seed(1234)

samp <- runif_skew(100, mean = 0.6)
samp
#>   [1] 0.30960945 0.77430422 0.76552230 0.77502862 0.92241558 0.78630999
#>   [7] 0.08116296 0.45539329 0.80322229 0.69863075 0.82094332 0.72083812
#>  [13] 0.50599764 0.95795377 0.51517442 0.90868101 0.50935602 0.49043546
#>  [19] 0.40456114 0.45505040 0.53784057 0.52495782 0.37103194 0.17624617
#>  [25] 0.44066837 0.89294028 0.70697395 0.95303388 0.90519278 0.18954096
#>  [31] 0.65484658 0.48881344 0.52680572 0.69352738 0.39794079 0.86223518
#>  [37] 0.42123912 0.48243924 0.99575933 0.89101011 0.72677936 0.79033774
#>  [43] 0.53343892 0.77398195 0.54978076 0.68960374 0.81035543 0.67690563
#>  [49] 0.46727665 0.86577235 0.24520062 0.53146373 0.83594108 0.69148941
#>  [55] 0.36335672 0.69103666 0.68362817 0.85703730 0.39023824 0.91515557
#>  [61] 0.92467723 0.18062297 0.53836223 0.09909591 0.46218794 0.82914424
#>  [67] 0.52998881 0.69444152 0.20229817 0.73470110 0.32085460 0.94070430
#>  [73] 0.10245695 0.87648793 0.27287373 0.70224102 0.59704639 0.23844073
#>  [79] 0.54152334 0.80478925 0.95961235 0.66699780 0.34984342 0.72033505
#>  [85] 0.41547888 0.94396313 0.60141737 0.53255866 0.37226639 0.94260574
#>  [91] 0.38017938 0.94500732 0.33838968 0.33502176 0.29703296 0.69667413
#>  [97] 0.52262060 0.14178325 0.53142748 0.85143495

hist(samp)

Note that the sample stays within (0, 1), and the sample mean matches the target (up to the numerical tolerance of optimize):

mean(samp)
#> [1] 0.6

Created on 2020-12-09 by the reprex package (v0.3.0)
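
As a side note, the exponent can also be derived in closed form for the population rather than fitted to each sample. This is a sketch, not part of the function above: for U uniform on (0, 1) and a > -1, E[U^a] = 1 / (a + 1), so a = 1 / target - 1, and U^a then follows a Beta(1 / a, 1) distribution. The variable names below are illustrative.

```r
target <- 0.6
a <- 1 / target - 1   # from E[U^a] = 1 / (a + 1)

set.seed(1234)
y <- runif(1e5)^a
mean(y)   # close to the target, but not forced to match it exactly

# Sample quantiles next to the quantiles of Beta(1 / a, 1)
probs <- c(0.1, 0.5, 0.9)
rbind(sample = quantile(y, probs), beta = qbeta(probs, 1 / a, 1))
```

For a target of 0.6 the exponent is 2/3 and the density is 1.5 * sqrt(y), which is 0 at y = 0 and 1.5 at y = 1. Unlike runif_skew, this version only hits the target mean in expectation.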

Allan Cameron
  • A great solution; however, the numbers are not distributed "uniformly" (as other people stated, I was unfortunately not precise enough about the definition). Small numbers are rare and bigger numbers are several times more frequent. – Volokh Dec 09 '20 at 22:58
  • @Volokh the numbers can't be distributed uniformly AND have a mean of 0.6 AND have a range of 0 to 1. You can pick any two of these three, but it's not mathematically or logically possible to have all 3. You said larger numbers could be a bit more frequent. This is just how much more frequent they need to be to give a mean of 0.6. – Allan Cameron Dec 10 '20 at 00:03
  • You are absolutely right. Bigger numbers have to be a bit more frequent if one wants a mean > 0.5. However, with this solution, numbers close to 1.0 occur 5 times more often than numbers close to 0. – Volokh Dec 10 '20 at 07:35
  • @Volokh yes; if you want the mean to be 0.6 that's how it _needs_ to be to get a mean of 0.6. Try it by hand. – Allan Cameron Dec 10 '20 at 07:55