6

(Reproducible example added.) I am a little confused about the rnorm function. I expected mean(rnorm(100,mean=0,sd=1)) to be 0 and sd(rnorm(100,mean=0,sd=1)) to be 1, but I got different results. Where am I going wrong?

Reproducible Example:

mean(rnorm(100,mean=0,sd=1))
# [1] 0.07872548
sd(rnorm(100,mean=0,sd=1))
# [1] 1.079348

Any help is greatly appreciated.

BMW
Erdogan CEVHER
  • Your sample size is too small. The larger the sample size, the closer the mean will tend to get to 0 and the sd to 1. – John Paul Jan 05 '15 at 21:16
  • `rnorm` gives you random values drawn from a normal distribution with mean 0 and SD 1. "Random" means the values are drawn from the distribution at random, so it is possible that, for example, more of them come from the right side than from the left. Still, your `mean` and `sd` are very close to the targets. The bigger your data set, the closer they will tend to get, by the law of large numbers (LLN). – David Arenburg Jan 05 '15 at 21:19
  • @JohnPaul, you are probably right. Interestingly, though, as far as I can see, the "?rnorm" help documentation says nothing about the mean and sd holding only in the asymptotic case! – Erdogan CEVHER Jan 05 '15 at 21:20
  • Also, try running `plot(sapply(1L:1e4, function(x) mean(rnorm(x))))` or `plot(sapply(1L:1e4, function(x) sd(rnorm(x))))` – David Arenburg Jan 05 '15 at 21:28
  • @DavidArenburg, your technique is really helpful, not only for this problem but also for anyone wondering about long-run behaviour in other problems. Thx. – Erdogan CEVHER Jan 05 '15 at 21:36
  • Very closely related: http://stackoverflow.com/questions/18919091/r-generate-random-numbers-with-fixed-mean-and-sd – Ben Bolker Jan 05 '15 at 22:08
  • @BenBolker, I admired your miraculous solution at that link as well. I will make use of that one too. – Erdogan CEVHER Jan 05 '15 at 22:15

3 Answers

13

rnorm(100) gives you a random sample of 100 values drawn from a normal distribution with mean = 0 and sd = 1. Because the sample is random, the actual value of mean(rnorm(100)) depends on which particular values you get back. There is no guarantee that the sample mean will be exactly 0, but it converges to 0 as the sample size grows. For example, try mean(rnorm(10000)); it will probably be closer to 0 than before.
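
A minimal sketch of that convergence, assuming base R only. The seed and the sample sizes are arbitrary choices for illustration and do not come from the question. The sample mean of n standard normal draws has standard error 1/sqrt(n), so its typical distance from 0 shrinks as n grows:

```r
set.seed(42)  # arbitrary seed, only so the run is repeatable

# Sample sizes to compare (illustrative choices)
n <- c(100, 10000, 1000000)

# Sample mean and sd for each size; every rnorm() call draws a fresh sample
sample_means <- sapply(n, function(k) mean(rnorm(k)))
sample_sds   <- sapply(n, function(k) sd(rnorm(k)))

# Theoretical standard error of the sample mean: sd / sqrt(n)
std_error <- 1 / sqrt(n)

data.frame(n, sample_means, sample_sds, std_error)
```

Each sample mean should usually fall within a few standard errors of 0, which is why the n = 100 result in the question is unremarkable.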

Edit: If you want to force the sample to have a particular mean and standard deviation, check out this question: "Generate random numbers with fixed mean and sd".
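
One common way to do this, sketched here under my own assumptions rather than copied from the linked answer, is to standardize the draw with base R's scale() and then shift and rescale it. The helper name rnorm_exact and its argument names mu and sigma are hypothetical:

```r
# Hypothetical helper: draw n values, then force the sample mean and sd
# to match the requested values (up to floating-point error).
rnorm_exact <- function(n, mu = 0, sigma = 1) {
  z <- as.vector(scale(rnorm(n)))  # centered to mean 0, scaled to sd 1
  mu + sigma * z
}

x <- rnorm_exact(100, mu = 0, sigma = 1)
mean(x)  # 0 up to floating-point error
sd(x)    # 1 up to floating-point error
```

Note that values adjusted this way are no longer independent draws from the distribution, because the sample has been constrained to hit the targets.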

Community
rsoren
  • I had read rnorm(100,mean=0,sd=1) as "100 values whose mean is 0 and sd is 1", not as "100 values from a distribution with mean=0 and sd=1". Your explanation is very clear. I got it. Thanks a lot. – Erdogan CEVHER Jan 05 '15 at 21:27
0

This is due to sampling noise. Try larger samples to get closer to the target values, or change the seed to see how the results vary.
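
To see that noise directly, here is a small sketch that repeats the draw under several seeds. The seed values 1 to 5 and the helper name mean_for_seed are arbitrary choices for illustration. Each seed gives a different sample mean, while reusing a seed reproduces the same sample:

```r
# Sample mean of 100 standard normal draws under a given seed
mean_for_seed <- function(seed) {
  set.seed(seed)
  mean(rnorm(100))
}

# Different seeds give different sample means, scattered around 0
means_by_seed <- sapply(1:5, mean_for_seed)
means_by_seed

# The same seed reproduces the same sample, hence the same mean
identical(mean_for_seed(1), mean_for_seed(1))
```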

MAC
0

rnorm creates random deviates, so the mean of a small sample can land well away from 0:

set.seed(4)
x <- rnorm(5, mean=0, sd=1)
x
# [1]  0.2167549 -0.5424926  0.8911446  0.5959806  1.6356180
mean(c(0.2167549, -0.5424926, 0.8911446, 0.5959806, 1.6356180))
# [1] 0.5594011
JasonAizkalns