I have a problem with ggplot legends. Here is my first code, which shows only the legend for corrGenes and works fine.

library(ggplot2)

gene1=c(1.041,0.699,0.602,0.602,2.585,0.602,1.000,0.602,1.230,1.176,0.699,0.477,1.322)
BIME=c(0.477,0.477,0.301,0.477,2.398,0.301,0.602,0.301,0.602,0.699,0.602,0.477,1.176)
corrGenes=c(0.922,0.982,0.934,0.917,0.993,0.697,0.000,0.440,0.859,0.788,0.912,0.687,0.894)

DF=data.frame(gene1,BIME,corrGenes)

plot= ggplot(data=DF,aes(x=gene1,y=BIME))+
  geom_point(aes(colour=corrGenes),size=5)+
  ylab("BIME normalized counts (log10(RPKM))")+
  xlab("gene1 normalized counts (log10(RPKM))")

When I add an abline and a smooth, I get the correct plot with:

plot= ggplot(data=DF,aes(x=gene1,y=BIME))+
  geom_point(aes(colour=corrGenes),size=5)+
  geom_abline(intercept=0, slope=1)+
  stat_smooth(method = "lm",se=FALSE)+
  ylab("BIME normalized counts (log10(RPKM))")+
  xlab("gene1 normalized counts (log10(RPKM))")

But I can't find a way to get a legend for those two lines. I tried the following, and many other combinations:

plot= ggplot(data=DF,aes(x=gene1,y=BIME))+
  geom_point(aes(colour=corrGenes),size=5)+
  geom_abline(aes(colour="best"),intercept=0, slope=1)+
  stat_smooth(aes(colour="data"),method = "lm",se=FALSE)+
  scale_colour_manual(name="Fit", values=c("data"="blue", "best"="black"))+
  ylab("BIME normalized counts (log10(RPKM))")+
  xlab("gene1 normalized counts (log10(RPKM))")

If anyone has an idea for solving this tiny but very annoying problem, it would be very helpful!

Mesmer

3 Answers

Finally, I found another way using a trick. First, I computed the linear regression and converted the results to a data frame, to which I added my best fit (intercept = 0 and slope = 1). Then I added a column for the type of line (data or best).

modele = lm(BIME ~ gene1, data=DF)
coefs = data.frame(intercept=coef(modele)[1],slope=coef(modele)[2])
coefs= rbind(coefs,list(0,1))
regression=as.factor(c('data','best'))
coefs=cbind(coefs,regression)

Then I plotted both lines with a single geom_abline call, moving the DF from ggplot() to geom_point(), and used the linetype aesthetic to differentiate the two lines:

plot = ggplot()+
  geom_point(data=DF,aes(x=gene1,y=BIME,colour=corrGenes),size=5)+
  geom_abline(data=coefs, aes(intercept=intercept,slope=slope,linetype=regression), show_guide=TRUE)+
  ylab("BIME normalized counts (log10(RPKM))")+
  xlab("gene1 normalized counts (log10(RPKM))")

There may be a way to use colors for those two lines, but I can't figure out how.
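One possible way to colour the two lines, offered as a sketch rather than a confirmed fix: draw the points with a fillable shape (shape 21) and map corrGenes to fill, which leaves the colour scale free for the lines. The shape, the grey outline, and the use of `[[` to drop coefficient names are assumptions added for illustration; the data and the "data"/"best" labels come from the question. In recent ggplot2 versions a layer with a mapped aesthetic gets a legend by default, so no show_guide argument is used here.

```r
library(ggplot2)

# Data from the question
gene1     <- c(1.041,0.699,0.602,0.602,2.585,0.602,1.000,0.602,1.230,1.176,0.699,0.477,1.322)
BIME      <- c(0.477,0.477,0.301,0.477,2.398,0.301,0.602,0.301,0.602,0.699,0.602,0.477,1.176)
corrGenes <- c(0.922,0.982,0.934,0.917,0.993,0.697,0.000,0.440,0.859,0.788,0.912,0.687,0.894)
DF <- data.frame(gene1, BIME, corrGenes)

# One row per line: the fitted regression ("data") and the identity line ("best")
modele <- lm(BIME ~ gene1, data = DF)
coefs <- data.frame(
  intercept  = c(coef(modele)[[1]], 0),
  slope      = c(coef(modele)[[2]], 1),
  regression = c("data", "best")
)

p <- ggplot() +
  # Assumption: shape 21 has a fill, so the continuous variable can use the fill scale
  geom_point(data = DF, aes(x = gene1, y = BIME, fill = corrGenes),
             shape = 21, colour = "grey30", size = 5) +
  # The colour scale is now used only by the lines, so it can be discrete
  geom_abline(data = coefs,
              aes(intercept = intercept, slope = slope, colour = regression)) +
  scale_colour_manual(name = "Fit", values = c(data = "blue", best = "black")) +
  ylab("BIME normalized counts (log10(RPKM))") +
  xlab("gene1 normalized counts (log10(RPKM))")

print(p)
```

The design choice is simply to give each legend its own aesthetic: fill for the continuous correlation values, colour for the two named lines.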

Thanks for your help guys!

Mesmer

The show_guide=TRUE argument should display the legends for both geom_abline and stat_smooth. Try running the code below.

plot= ggplot(data=DF,aes(x=gene1,y=BIME))+
  geom_point(aes(colour=corrGenes),size=5)+
  geom_abline(aes(colour="best"),intercept=0, slope=1, show_guide=TRUE)+
  stat_smooth(aes(colour="data"),method = "lm",se=FALSE, show_guide=TRUE)+
  scale_colour_manual(name="Fit", values=c("data"="blue", "best"="black"))+
  ylab("BIME normalized counts (log10(RPKM))")+
  xlab("gene1 normalized counts (log10(RPKM))")

figurine
  • Thanks for the answer. I've already tried that, but I get the same error: `Error: Continuous value supplied to discrete scale` – Mesmer Apr 24 '15 at 10:03
  • If you remove the scale_colour_manual line does that resolve the problem at all? – figurine Apr 24 '15 at 10:25
  • @Mesmer `Error: Continuous value supplied to discrete scale` most of the time means that `R` is reading your x-values, given as characters, as continuous instead. Try `x <- as.factor(x)` and then plot. – gented Oct 18 '15 at 21:24

Not sure if this is the best solution, but I was able to tell ggplot to use two scales: one for the colour (your points) and another for the fill colour. Which fill colour, you are probably asking? The one I added in the aes for the two lines:

plot = ggplot(data=DF,aes(x=gene1,y=BIME)) + 
  geom_point(size=5, aes(colour=corrGenes)) +
  geom_abline(aes(fill="black"),intercept=0, slope=1) + 
  stat_smooth(aes(fill="blue"), method = "lm",se=FALSE) +
  scale_fill_manual(name='My Lines', values=c("black", "blue"))+
  ylab("BIME normalized counts (log10(RPKM))")+
  xlab("gene1 normalized counts (log10(RPKM))")

mucio
  • Thanks mucio, it's far better, but there is still a small problem which I don't understand: why are the colors in the legend both blue, and not black and blue, given that you specify `geom_abline(aes(fill="black"),intercept=0, slope=1)` and `stat_smooth(aes(fill="blue"), method = "lm",se=FALSE)` and link them to `scale_fill_manual(name='My Lines', values=c("black", "blue"))`? – Mesmer Apr 24 '15 at 12:33