
I'm trying to create a histogram with two superimposed density plots. The problem is that I want one density curve drawn as a dashed line. That works perfectly on the plot itself, but the dashed line does not appear in the legend, as in the following example:

library(ggplot2)

x <- sort(rnorm(1000))
data <- data.frame(x=x, Normal=dnorm(x,mean(x),sd=sd(x)), Student=dt(x,df=3))

ggplot(data, aes(y=x)) +
  geom_histogram(aes(x=x,y=..density..), color="black", fill="darkgrey") +
  geom_line(aes(x=x,y=Normal,color="Normal"), size=1, linetype=2) +
  geom_line(aes(x=x,y=Student,color="Student"), size=1) +
  ylab("") + xlab("") + labs(title="Density estimations") +
  scale_color_manual(values=c("Student"="black","Normal"="black"))

Any ideas how I get the dashed line in the legend?

Thank you very much!

Rainer

Example Plot

rainer

2 Answers


The "ggplot" way generally likes data to be in "long" format, with a separate column to specify each aesthetic. In this case, linetype should be treated as an aesthetic, so that it gets its own legend. The easiest way to deal with this is to prep your data into the appropriate format with the reshape2 package:

library(reshape2)
data.m <- melt(data, measure.vars = c("Normal", "Student"), id.vars = "x")
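For readers without reshape2, roughly the same long layout can be built in base R. This is only a sketch and not part of the original answer: the names `wide` and `long` are hypothetical, and `set.seed()` is added purely so the example is reproducible.

```r
set.seed(1)  # assumption: fixed seed, only for reproducibility
x <- sort(rnorm(1000))
wide <- data.frame(x = x,
                   Normal = dnorm(x, mean(x), sd = sd(x)),
                   Student = dt(x, df = 3))

# Stack the two density columns on top of each other and record
# which column each value came from, as melt() does
long <- data.frame(
  x = rep(wide$x, times = 2),
  variable = factor(rep(c("Normal", "Student"), each = nrow(wide)),
                    levels = c("Normal", "Student")),
  value = c(wide$Normal, wide$Student)
)

str(long)
```

Each x value now appears twice, once per curve, and `variable` is the column that can be mapped to `linetype`.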

And then modify your plotting code to look something like this:

ggplot(data,aes(y=x)) +
  geom_histogram(aes(x=x,y=..density..),color="black",fill="darkgrey") +
  geom_line(data = data.m, aes(x = x, y = value, linetype = variable), size = 1) +
  ylab("") +
  xlab("") +
  labs(title="Density estimations")

Results in something like this:

[image: resulting plot]
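Once linetype is mapped, ggplot2 chooses the line types itself, so the Normal curve is not guaranteed to be the dashed one. A possible way to pin that down is `scale_linetype_manual()`. The sketch below assumes ggplot2 3.4.0 or later (for `after_stat()` and `linewidth`); the object names `long` and `p`, the fixed seed, and `bins = 30` are illustrative choices, not taken from the answer.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed, only for reproducibility
x <- sort(rnorm(1000))

# Long format built by hand; column names mirror what melt() returns
long <- data.frame(
  x = rep(x, times = 2),
  variable = factor(rep(c("Normal", "Student"), each = length(x))),
  value = c(dnorm(x, mean(x), sd = sd(x)), dt(x, df = 3))
)

p <- ggplot(long, aes(x = x)) +
  # the histogram gets its own data so each x is counted once
  geom_histogram(data = data.frame(x = x), aes(y = after_stat(density)),
                 bins = 30, colour = "black", fill = "darkgrey") +
  geom_line(aes(y = value, linetype = variable), linewidth = 1) +
  # explicitly assign a line type to each curve; this also drives the legend
  scale_linetype_manual(name = NULL,
                        values = c(Normal = "dashed", Student = "solid")) +
  labs(title = "Density estimations", x = NULL, y = NULL)

print(p)
```

Because the line types come from a scale rather than a fixed `linetype = 2` argument, the legend keys are drawn with the same dashed and solid lines as the curves.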

Chase

You want to reshape this to long format; it makes things simpler.

x<-sort(rnorm(1000))
Normal=dnorm(x,mean(x),sd=sd(x))
Student=dt(x,df=3)
y= c(Normal,Student)
DistBn= rep(c('Normal', 'Student'), each=1000)
# don't call it 'data': that is already the name of an R function
df<-data.frame(x=x,y=y, DistBn=DistBn)

head(df)
          x           y DistBn
1 -2.986430 0.005170920 Normal
2 -2.957834 0.005621358 Normal
3 -2.680157 0.012126747 Normal
4 -2.601635 0.014864165 Normal
5 -2.544302 0.017179353 Normal
6 -2.484082 0.019930239 Normal   



ggplot(df, aes(x=x, y=y)) +
  geom_histogram(aes(x=x, y=..density..), color="black", fill="darkgrey") +
  geom_line(aes(x=x, y=y, linetype=DistBn)) +
  ylab("") + xlab("") + labs(title="Density estimations")

[image: resulting plot]
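If reshaping is not convenient, a similar legend can probably be obtained from the original wide data by mapping `linetype` to a constant label inside `aes()`, the same trick the question used for `color`. This is a sketch under assumptions, not either answerer's method: it assumes ggplot2 3.4.0 or later, and the names `wide` and `p`, the seed, and `bins = 30` are hypothetical.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed, only for reproducibility
x <- sort(rnorm(1000))
wide <- data.frame(x = x,
                   Normal = dnorm(x, mean(x), sd = sd(x)),
                   Student = dt(x, df = 3))

p <- ggplot(wide, aes(x = x)) +
  geom_histogram(aes(y = after_stat(density)), bins = 30,
                 colour = "black", fill = "darkgrey") +
  # a constant string inside aes() creates a legend entry with that label
  geom_line(aes(y = Normal, linetype = "Normal"), linewidth = 1) +
  geom_line(aes(y = Student, linetype = "Student"), linewidth = 1) +
  # map each label to the line type it should be drawn with
  scale_linetype_manual(name = NULL,
                        values = c(Normal = "dashed", Student = "solid")) +
  labs(title = "Density estimations", x = NULL, y = NULL)

print(p)
```

The long-format approach scales better to many curves; this variant is mainly useful when there are only two or three columns to draw.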

Stephen Henderson
  • don't disparage the F distribution! `?df` is an R command too :) – Chase Nov 26 '12 at 20:57
  • mine was too until someone pointed that out to me...there's a pretty detailed post on SO here that illustrates that overwriting R function names isn't actually *that* bad, as R is pretty smart at figuring out what you really want to do...still probably best practice to avoid it - but it is inevitable with 4000+ contributed packages and many more functions. – Chase Nov 26 '12 at 21:28
  • Thank you both very much! Great answers. – rainer Nov 27 '12 at 08:02