
I want to generate 50 numbers with rnorm, with two criteria. If there are numbers <99 or >101 (i.e., outside 99-101), omit them and run rnorm again until 50 total numbers that meet criteria are generated.

  1. This is unrelated to statistics. I'm simply trying to learn how to use while loops.
  2. I suspect the rnorm(50, ...) call on line 3 is computationally inefficient; any advice there would be great.

  3. The main problem I have is that although this code runs, it goes on forever. It needs to terminate when there are 50 observations that meet the criteria. Thus far, I have tried unsuccessfully to use if and break to do this...

Code so far:

1.   z = rnorm(50, mean = 100, sd = 10)
2.   while( match(TRUE, z > 99) > 0 | match(TRUE, z < 101) > 0 )  {
3.   z = c( z[z >= 99 & z <= 101], rnorm(50, mean = 100, sd = 10 )) 
4.   }
lnNoam
  • you can consider a random number generations with lower and upper bounds. See here http://stackoverflow.com/questions/19343133/setting-upper-and-lower-limits-in-rnorm – SabDeM May 30 '15 at 04:02
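For reference, one common way to get bounded normal draws without any loop is inverse-CDF sampling. This is a minimal sketch of that general technique, not necessarily the approach taken in the linked question; the variable names (mu, sigma, lwr, upr, p_lo, p_hi) are illustrative assumptions.

```r
set.seed(123)  # only so the sketch is reproducible

mu <- 100; sigma <- 10   # parameters from the question
lwr <- 99; upr <- 101    # bounds from the question

# CDF values at the two bounds
p_lo <- pnorm(lwr, mean = mu, sd = sigma)
p_hi <- pnorm(upr, mean = mu, sd = sigma)

# Draw uniforms between those CDF values, then map them back through the
# quantile function; every result falls between lwr and upr
u <- runif(50, min = p_lo, max = p_hi)
z <- qnorm(u, mean = mu, sd = sigma)

length(z)
range(z)
```

Because every draw is accepted, this always produces exactly 50 values in a single pass.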

2 Answers


The reason your loop "never quits" in its current form is that the stats are literally stacked against you.

The probability that 50 consecutive normally distributed draws all land within ±1 standard deviation of the mean is approximately 4.22E-9 (roughly 0.68^50). Your tolerance of ±1 around a mean of 100 with sd = 10 is only 1/10th of one standard deviation, so your odds are astronomically smaller still.
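The figures above can be checked directly with pnorm. This is a small sketch; p_within is a hypothetical helper name introduced here for illustration. The 4.22E-9 value corresponds to rounding the single-draw probability to 0.68 before raising it to the 50th power.

```r
# Probability that one standard-normal draw lands within +/- k standard deviations
# (p_within is a hypothetical helper, not part of base R)
p_within <- function(k) pnorm(k) - pnorm(-k)

p_one_sd   <- p_within(1)    # about 0.68
p_tenth_sd <- p_within(0.1)  # about 0.08

# Probability that 50 independent draws ALL land inside the band
p_all_one_sd   <- p_one_sd^50
p_all_tenth_sd <- p_tenth_sd^50

print(c(one_sd = p_all_one_sd, tenth_sd = p_all_tenth_sd))
```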

The simplest way to enforce a fixed number of iterations of a loop is the for loop:

total = 1  # initialise first; the bare name sum refers to R's built-in function
for (i in 1:50) {
  total = total + total^0.5
}

Otherwise you can add a watchdog counter as follows:

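z = rnorm(50, mean = 100, sd = 10)
wd = 0  # watchdog counter: counts completed iterations
while( match(TRUE, z > 99) > 0 | match(TRUE, z < 101) > 0 )  {
  z = c( z[z >= 99 & z <= 101], rnorm(50, mean = 100, sd = 10 ))
  wd = wd + 1
  # bail out after 50 iterations even if the loop condition still holds
  if (wd == 50) { break }
}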
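As a rough sketch of how the two ideas can be combined, the loop below tests the number of accepted values rather than the values themselves, and keeps a watchdog only as a safety net. The names target, max_iter, and batch are assumptions introduced for this illustration, and the cap of 1000 iterations is arbitrary.

```r
set.seed(1)  # only so the sketch is reproducible

target   <- 50     # how many in-range values we want
max_iter <- 1000   # watchdog limit (arbitrary choice)
z    <- numeric(0) # accepted values so far
iter <- 0

while (length(z) < target) {
  iter <- iter + 1
  if (iter > max_iter) stop("gave up after ", max_iter, " iterations")

  # draw a fresh batch and keep only the values inside [99, 101]
  batch <- rnorm(target, mean = 100, sd = 10)
  z <- c(z, batch[batch >= 99 & batch <= 101])
}

# the final batch may overshoot, so trim to exactly `target` values
z <- z[seq_len(target)]
length(z)
```

Since only about 8% of draws fall inside the range, each batch of 50 contributes around four values on average, so the loop normally finishes well before the watchdog limit.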

Also keep in mind that explicit loops in R are relatively slow compared to vectorized operations, and their use is discouraged unless there is a good reason. R is a functional language, and most operators and functions are vectorized, so vectorized code usually runs substantially faster than the equivalent loop-based, procedural code.
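To illustrate the vectorized style, here is a minimal sketch that oversamples once, filters with a logical index, and keeps the first 50 hits. The batch size of 2000 is an assumption chosen so that the expected number of in-range values (around 160) comfortably exceeds 50; it is not a guarantee, hence the check at the end.

```r
set.seed(42)  # only so the sketch is reproducible

n <- 50

# One large vectorized draw instead of repeated small ones
draws <- rnorm(2000, mean = 100, sd = 10)

# Logical indexing keeps only the values inside [99, 101]
in_range <- draws[draws >= 99 & draws <= 101]

# Keep the first n accepted values
z <- head(in_range, n)

# Oversampling makes a shortfall extremely unlikely, but verify anyway
stopifnot(length(z) == n)
```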

Special Sauce
  • Thanks I was not trying to create 50 normally distributed numbers. Rather, I just wanted to see excluding occur for numbers outside 99-101 until I had 50 numbers. I have actually learned about watchdog counters before (e.g., in python)...why I didn't think of it here is a mystery to me. Anyway, thanks again for the help! – lnNoam May 30 '15 at 04:33
  • Ah, I see what you were trying to accomplish. Your current code won't work because the `c( )` command is doing a vector concatenation that is always adding 50 new normal numbers to `z`, so `z` is always growing in length. – Special Sauce May 30 '15 at 04:41
  • Yes, I fixed that. Thanks again for the help. – lnNoam May 30 '15 at 04:59

I figured out this solution too, which will also do it:

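rnorm_rg = function(x, mean, sd, lwr, upr){
  # initial draw, keeping only the values inside [lwr, upr]
  z = rnorm(x, mean = mean, sd = sd)
  z = z[z >= lwr & z <= upr]
  # top up with fresh draws until at least x values are in range
  while ( length(z) < x ) {
    new = rnorm(x, mean = mean, sd = sd)
    z = c( z, new[new >= lwr & new <= upr] )
  }
  # the last batch can overshoot, so return exactly x values
  z[1:x]
}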

e.g.,

rnorm_rg(  x = 50
           , mean = 100
           , sd = 10
           , lwr = 99
           , upr = 101
)
lnNoam