setting upper and lower limits in rnorm

Question

I am simulating data using rnorm, but I need to set an upper and lower limit, does anyone know how to do this?

code:

rnorm(n = 10, mean = 39.74, sd = 25.09)

Upper limit needs to be 340, and the lower limit 0

I am asking this question because I am rewriting an SAS-code into an R-code. I have never used SAS. I am trying to rewrite the following piece of code:

sim_sample(simtot=100000,seed=10004,lbound=0,ubound=340,round_y=0.01,round_m=0.01,round_sd=0.01,n=15,m=39.74,sd=25.11,mk=4)

It's not clear what you want to do. The normal distribution is unlimited by definition. Do you want a different distribution (e.g., `runif` that lets you define limits) or discard values outside of you limits? Please clarify. — Roland, Oct 13 '13 at 08:27
I want to sample data from a normal distribution, but with limits on the range. In this case, I have a mean of 39.74 and a SD of 25.09, and I need to sample data with this mean and standard deviation, but the numbers cannoot exceed 340. Do I need to use runif for this? — Jolanda Kossakowski, Oct 13 '13 at 08:42
You don't seem to know what you need. That makes it hard to answer your question. — Roland, Oct 13 '13 at 08:56
I do know what I need, I just don't know how to get it. I have a SAS code that I am trying to rewrite in R, and in that SAS code the writer has an upper and a lower limit while sampling with a certain mean and standard deviation. Is sample() an option? I have put the SAS code that I am trying to rewrite in the question — Jolanda Kossakowski, Oct 13 '13 at 10:14

score 32 · Answer 1 · answered Oct 13 '13 at 21:34

32

The rtruncnorm() function will return the results you need.

  library(truncnorm)
  rtruncnorm(n=10, a=0, b=340, mean=39.4, sd=25.09)

answered Oct 13 '13 at 21:34

charlie

602
4
12

score 15 · Answer 2 · answered Oct 13 '13 at 21:46

15

You can make your own truncated normal sampler that doesn't require you to throw out observations quite simply

rtnorm <- function(n, mean, sd, a = -Inf, b = Inf){
    qnorm(runif(n, pnorm(a, mean, sd), pnorm(b, mean, sd)), mean, sd)
}

answered Oct 13 '13 at 21:46

Dason

60,663
9
131
148

score 10 · Accepted Answer · edited Jun 20 '20 at 09:12

Like this?

mysamp <- function(n, m, s, lwr, upr, nnorm) {
  samp <- rnorm(nnorm, m, s)
  samp <- samp[samp >= lwr & samp <= upr]
  if (length(samp) >= n) {
    return(sample(samp, n))
  }  
  stop(simpleError("Not enough values to sample from. Try increasing nnorm."))
}

set.seed(42)
mysamp(n=10, m=39.74, s=25.09, lwr=0, upr=340, nnorm=1000)
#[1] 58.90437 38.72318 19.64453 20.24153 39.41130 12.80199 59.88558 30.88578 19.66092 32.46025

However, the result is not normal distributed and usually won't have the mean and sd you've specified (in particular if the limits are not symmetric around the specified mean).

Edit:

According to your comment it seems you want to translate this SAS function. I am not an SAS user, but this should do more or less the same:

mysamp <- function(n, m, s, lwr, upr, rounding) {
  samp <- round(rnorm(n, m, s), rounding)
  samp[samp < lwr] <- lwr
  samp[samp > upr] <- upr
  samp
}

set.seed(8)
mysamp(n=10, m=39.74, s=25.09, lwr=0, upr=340, rounding=3)
#[1] 37.618 60.826 28.111 25.920 58.207 37.033 35.467 12.434  0.000 24.857

You may then want to use replicate to run the simulations. Or if you want faster code:

sim <- matrix(mysamp(n=10*10, m=39.74, s=25.09, lwr=0, upr=340, rounding=3), 10)
means <- colMeans(sim)
sds <- apply(sim, 2, sd)

score 0 · Answer 4 · answered Oct 13 '13 at 08:55

0

Assuming you want exactly 10 numbers and not the subset of them that is >0, <340 (and night not be a normal distribution):

    aa <- rnorm(n = 10, mean = 39.74, s = 25.09)

    while(any(aa<0 | aa>340)) { aa <- rnorm(n = 10, mean = 39.74, s = 25.09) }

answered Oct 13 '13 at 08:55

alexis_laz

12,884
4
27
37

Depending on n and the limits set this function could take a *very* long time to generate the sample of interest – Dason Oct 13 '13 at 21:52

score 0 · Answer 5 · answered Feb 22 '15 at 18:45

This is the function that I wrote to achieve the same purpose. It normalizes the result from the rnorm function and then adjusts it to fit the range.

NOTE: The standard deviation and mean (if specified) get altered during the normalization process.

#' Creates a random normal distribution within the specified bounds.
#' 
#' WARNING: This function does not preserve the standard deviation or mean.
#' @param n The number of values to be generated
#' @param mean The mean of the distribution
#' @param sd The standard deviation of the distribution
#' @param lower The lower limit of the distribution
#' @param upper The upper limit of the distribution
rtnorm <- function(n, mean=NA, sd=1, lower=-1, upper=1){
  mean = ifelse(is.na(mean)|| mean < lower || mean > upper,
                mean(c(lower, upper)), mean)
  data <- rnorm(n, mean=m, sd=sd) # data

  if (!is.na(lower) && !is.na(upper)){ # adjust data to specified range
    drange <- range(data)           # data range
    irange <- range(lower, upper)   # input range
    data <- (data - drange[1])/(drange[2] - drange[1]) # normalize data (make it 0 to 1)
    data <- (data * (irange[2] - irange[1]))+irange[1] # adjust to specified range
  }
  return(data)
}

Note that this isn't the same as what is typically referred to as a truncated normal. — Dason, Apr 08 '19 at 18:19
@Dason: No, it is not a truncated normal distribution. It rather expands/reduces the bounds of a standard normal distribution to meet the specified upper and lower limits. — Alex Essilfie, Apr 09 '19 at 15:21

score 0 · Answer 6 · answered Apr 07 '21 at 10:02

There are several ways to set upper and lower limits to a normal distribution, what will cause that the result is no longer normal distributed.

Assuming a mean=0, sd=1 producing N=1e5 values with a lower boundary of LO=-1 and an upper boundary of UP=2.

N <- 1e5L
LO <- -1
UP <- 2

Move outliers to border (@Roland)

set.seed(42)
x <- pmax(LO, pmin(UP, rnorm(N)))
mean(x)
#[1] 0.07238029
median(x)
#[1] -0.002066374
sd(x)
#[1] 0.8457605
hist(x, 30)

Cut outliers of (@Dason, @Roland, truncnorm::rtruncnorm, MCMCglmm::rtnorm)

set.seed(42)
x <- qnorm(runif(N, pnorm(LO), pnorm(UP)))
mean(x)
#[1] 0.2317875
median(x)
#[1] 0.173679
sd(x)
#[1] 0.7236536

Scale (@Alex Essilfie)

set.seed(42)
x <- rnorm(N)
x <- (x-min(x))/(max(x)-min(x))*(UP-LO)+LO
mean(x)
#[1] 0.4474876
median(x)
#[1] 0.4482257
sd(x)
#[1] 0.3595199

Combination of methods. E.g. Cut and scale:

set.seed(42)
x <- qnorm(runif(N, pnorm(-3), pnorm(3)))
x <- (x-min(x))/(max(x)-min(x))*(UP-LO)+LO
mean(x)
#[1] 0.5010759
median(x)
#[1] 0.5014713
sd(x)
#[1] 0.4957751

Asymmetric combination

set.seed(42)
n <- round(N*abs(LO)/diff(range(c(LO, UP))))
x <- c(qnorm(runif(n, pnorm(-3), 0.5)), qnorm(runif(N-n, 0.5, pnorm(3))))
x <- ifelse(x < 0, x/min(x)*LO, x/max(x)*UP)
mean(x)
#[1] 0.2651627
median(x)
#[1] 0.2127903
sd(x)
#[1] 0.5078264

setting upper and lower limits in rnorm

6 Answers6

Edit:

Linked

Related