I've got my histogram and my normal curve to plot, but the curve appears much smaller than the histogram bars. What am I doing wrong that makes it so much smaller than it should be?

As you can see in my code I've plotted the histogram and tried two methods of plotting that normal curve. I had already calculated the sd and mean of the data set so I just used the actual numbers. The lines do plot, just way lower than they should be.

g = read.csv("C:/Users/emkat/Documents/decave.txt",header=FALSE)

g

m <- lapply(g,mean)

std <- sqrt(var(g))

hist(g[,1],plot = TRUE)
x <- g[,1]

y <- dnorm(x,mean = 26.59138,sd = 5.046878)

curve(dnorm(x,mean = 26.59138,sd = 5.046878),col="darkblue",lwd=2,add = TRUE)

lines(density(g[,1]),col="blue")

[Image: histogram]

Z.Lin
emkat

1 Answer

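By default, the y-axis of a histogram shows the frequency (count) of observations in each bin, which can be any non-negative whole number and is often much larger than 1. A density curve, by contrast, is on the probability-density scale, where the total area under the curve is 1, so its values are usually small. To make the histogram's y-axis match the probability density, pass the argument freq = FALSE to hist().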
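The scale mismatch can be checked numerically before drawing anything. The following is a minimal sketch using the built-in cars dataset; the variable names (x, h, max_count, and so on) are chosen only for illustration.

```r
x <- cars$dist

# Compute the histogram without drawing it
h <- hist(x, plot = FALSE)

# Tallest bar on the count scale (the default for equal-width bins)
max_count <- max(h$counts)

# Tallest bar on the density scale (what freq = FALSE would plot)
max_density <- max(h$density)

# On the density scale the bars enclose a total area of 1
total_area <- sum(h$density * diff(h$breaks))

print(c(max_count = max_count,
        max_density = max_density,
        total_area = total_area))
```

The counts and the densities differ by a factor of (number of observations) times (bin width), which is why a density curve drawn on a count-scale histogram hugs the x-axis.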

For example, the code below uses the built-in cars dataset to demonstrate the effect of the named argument freq.

data_1 <- cars

## Draw two plots side by side
par(mfrow = c(1, 2))

## Without freq = FALSE: y-axis shows counts, so the density curve looks flat
hist(data_1$dist, freq = TRUE)
lines(density(data_1$dist), col = "red")

## With freq = FALSE: y-axis shows density, so the curve is on the same scale
hist(data_1$dist, freq = FALSE)
lines(density(data_1$dist), col = "blue")

The resulting image is shown below. On the left, the red density curve looks flat because the histogram's y-axis is on the count scale (reaching above 15), which is far too large for probability-density values to be visible. On the right, with freq = FALSE, the histogram is on the density scale and the blue curve is clearly visible.

[Image: side-by-side histograms of data_1$dist, with the red density curve on the left and the blue density curve on the right]
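The same idea carries over to the normal curve in the question. The sketch below uses cars$dist as a stand-in for the asker's g[,1], since that file is not available here, and computes the mean and standard deviation from the data rather than hard-coding them. It shows two options: put the histogram on the density scale, or keep the counts and scale the curve up instead. The second option assumes equal-width bins.

```r
x <- cars$dist   # stand-in for g[, 1] from the question
m <- mean(x)
s <- sd(x)

old_par <- par(mfrow = c(1, 2))

# Option 1: histogram on the density scale, curve drawn as-is
hist(x, freq = FALSE, main = "Density scale")
curve(dnorm(x, mean = m, sd = s), col = "darkblue", lwd = 2, add = TRUE)

# Option 2: histogram on the count scale, curve multiplied by n * bin width
# (assumes all bins have the same width)
h2 <- hist(x, main = "Count scale")
bin_width <- diff(h2$breaks)[1]
scale_factor <- length(x) * bin_width
curve(dnorm(x, mean = m, sd = s) * scale_factor,
      col = "darkblue", lwd = 2, add = TRUE)

par(old_par)     # restore the previous plotting layout
```

Option 1 is usually the simpler choice; Option 2 is useful when the y-axis should keep reading as counts.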

I hope this was of some help. If the problem remains, please let us know.

Amit