R overlap normal curve to probability histogram

Question

In R I'm able to overlay a normal curve on a density histogram. I can also convert the density histogram to a probability one:

a <- rnorm(1:100)
test <-hist(a,  plot=FALSE)
test$counts=(test$counts/sum(test$counts))*100   # Probability
plot(test, ylab="Probability")
curve(dnorm(x, mean=mean(a), sd=sd(a)), add=TRUE)

But I cannot overlap the normal curve anymore since it goes off scale.

Is there a solution? Maybe a second y-axis?

Possible duplicate: http://stackoverflow.com/q/20078107/903061 (at least strongly related). — Gregor Thomas, Oct 12 '15 at 16:03

LyzandeR · Accepted Answer · 2015-10-12T15:37:56.877

4

Now the question is clear to me. Indeed a second y-axis seems to be the best choice for this as the two data sets have completely different scales.

In order to do this you could do:

set.seed(2)
a <- rnorm(100)
test <- hist(a, plot=FALSE)
test$counts <- (test$counts/sum(test$counts))*100   # percentage of observations per bin
plot(test, ylab="Probability")
#start a new plot on top of the existing one
par(new=TRUE)
#instead of using curve, just use plot and create the data yourself
#this is how curve works internally anyway
x_grid <- seq(min(test$breaks), max(test$breaks), length.out=401)
curve_data <- dnorm(x_grid, mean=mean(a), sd=sd(a))
#plot the line with no axes or labels, on the same x-range as the histogram
plot(x_grid, curve_data, axes=FALSE, xlab='', ylab='', type='l', col='red', xlim=range(test$breaks))
#now add the second y-axis (density scale) on the right
axis(4, at=pretty(range(curve_data)))
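
For comparison, here is a minimal sketch of a one-axis alternative that is not part of the code above. It assumes the histogram has equal-width bins (the default for hist() here) and rescales the normal density by the bin width, so the curve lands on the same percent scale as the bars. The names h, bin_width and peak are illustrative.

```r
set.seed(2)                      # same seed as above, for reproducibility
a <- rnorm(100)

h <- hist(a, plot = FALSE)
bin_width <- diff(h$breaks)[1]   # assumption: all bins have the same width

# Convert counts to the percentage of observations in each bin
h$counts <- h$counts / sum(h$counts) * 100

# Height of the rescaled curve at its maximum, used to size the y-axis
peak <- dnorm(mean(a), mean = mean(a), sd = sd(a)) * bin_width * 100

plot(h, ylab = "Probability (%)", ylim = c(0, max(h$counts, peak)))

# dnorm() returns a density; density * bin width approximates the
# probability of a bin, and * 100 puts it on the percent scale
curve(dnorm(x, mean = mean(a), sd = sd(a)) * bin_width * 100,
      add = TRUE, col = "red")
```

With this rescaling both the bars and the curve share the left axis, so no second y-axis is needed. With unequal bin widths the single scale factor would no longer apply.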

edited Oct 12 '15 at 15:37

answered Oct 12 '15 at 14:16

LyzandeR

I think I should only have used one curve instead of two but the solution is the same. – LyzandeR Oct 12 '15 at 14:25
Thanks, I probably didn't explain it correctly. I want the y-axis to be the probability and not the density. Then I need to overlay the normal curve. Should I use another y-axis, perhaps? – Oct 12 '15 at 14:46
@g256 I still don't get the problem. It seems like both your graphs have limits between 0 and 1. Why would you need a second y-axis? Even density is a probability. – LyzandeR Oct 12 '15 at 14:49
@g256 Also if the problem is that the density varies between 0-1 whereas the curve can go beyond 1 you shouldn't be using `curve(dnorm(x, ...` but something that shows this difference... – LyzandeR Oct 12 '15 at 14:53
Thanks this is the answer to my question. – Oct 12 '15 at 15:44

score 2 · Answer 2 · edited Oct 12 '15 at 14:20

First, save your rnorm data in a variable; otherwise you get different data each time you call rnorm.

seed = rnorm(100)

Next go ahead with

hist(seed,probability = T)
curve(dnorm(x, mean=mean(na.omit(seed)), sd=sd(na.omit(seed))), add=TRUE)

Now you have the expected result: a histogram with a density curve.
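
As a quick check of what the y-axis of such a histogram measures, the sketch below (an illustration, not part of the answer's code) shows that with probability = TRUE the bar heights are densities, and that multiplying each height by its bin width recovers the per-bin probability. The seed and the names samp, h, widths and bin_prob are assumptions made for the example.

```r
set.seed(1)                      # arbitrary seed, only for reproducibility
samp <- rnorm(100)               # illustrative data

h <- hist(samp, plot = FALSE)    # compute the histogram without drawing it
widths <- diff(h$breaks)         # width of each bin

# Bar area (density * width) is the share of observations in each bin
bin_prob <- h$density * widths

sum(bin_prob)                                   # total area of the bars
all.equal(bin_prob, h$counts / sum(h$counts))   # areas equal the bin proportions
```

This is why a density histogram and a probability histogram differ in scale by the bin width.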

edited Oct 12 '15 at 14:20

Ben Bolker

answered Oct 12 '15 at 14:13

Patrick C.

Thanks, I probably didn't explain it correctly. I want the y-axis to be the probability and not the density. Then I need to overlay the normal curve. Should I use another y-axis, perhaps? – Oct 12 '15 at 14:35
Manually set the y-axis to the interval [0,1] with ylim = c(0,1). Then the probability function added with curve(,..., add=T) should fit in the same plot. – Patrick C. Oct 12 '15 at 15:16

score 0 · Answer 3 · answered Oct 12 '15 at 15:59

The y-axis isn't a "probability" as you have labeled it. It is count data. If you convert your histogram to probabilities, you shouldn't have a problem:

samp <- rnorm(1000)
hist(samp, freq=FALSE, ylab="Probability")
# the data must not be named x: inside curve(), x is the plotting variable
curve(dnorm(x, mean=mean(samp), sd=sd(samp)), add=TRUE)
