
I am having trouble with geom_text. I have two sets of points, which I have coloured differently, and I have fitted a line to each set using a linear model.

When I try to add the line equation, Pearson's correlation and Spearman's correlation, a letter appears in the legend and the text becomes pixelated. How can I correct this?

I don't get this issue when I plot each set individually, only when I plot them together.

library(ggplot2)
library(reshape2)

# Build a plotmath string with the fitted line equation and R^2 for area ~ Mass
lm_eqn = function(df){
    m = lm(area ~ Mass, df)
    eq <- substitute(italic(y) == a + b %.% italic(x)*","~~italic(r)^2~"="~r2, 
         list(a = format(coef(m)[1], digits = 2), 
              b = format(coef(m)[2], digits = 2), 
             r2 = format(summary(m)$r.squared, digits = 3)))
    # Return the expression as text so geom_text(parse=TRUE) can render it
    as.character(as.expression(eq))
}

# Same as lm_eqn, used for the PA data set
lm_eqn_PA = function(df){
    m = lm(area ~ Mass, df)
    eq <- substitute(italic(y) == a + b %.% italic(x)*","~~italic(r)^2~"="~r2, 
         list(a = format(coef(m)[1], digits = 2), 
              b = format(coef(m)[2], digits = 2), 
             r2 = format(summary(m)$r.squared, digits = 3)))
    as.character(as.expression(eq))
}


dat <- read.table("../../mass_vs_area/mass_vs_area.txt",header=TRUE)
area <- read.table("../results/all_values.txt", header=TRUE)

all_dat <- merge(area,dat,by="file")

all_dat1 <- data.frame(file=all_dat$file,Mass=all_dat$Mass,area=all_dat$area,Cluster=c("Experimental"))
all_dat2 <- data.frame(file=all_dat$file,Mass=all_dat$Mass,area=all_dat$PA,Cluster=c("PA"))

plot_dat = rbind(all_dat1,all_dat2)

data.label.equation1 <- data.frame(x = 250,y = 20000,label = paste("Exp_eq.: ",lm_eqn(all_dat1)))
data.label.pearson1 <- data.frame(x = 250, y = 21000,label = paste("Exp_Pearson: ",format(cor(all_dat1$Mass,all_dat1$area),digits=3)))
data.label.spearman1 <- data.frame(x = 250, y = 22000, label = paste("Exp_Spearman: ",format(cor(all_dat1$Mass,all_dat1$area,method="spearman"),digits=3)))

data.label.equation2 <- data.frame(x = 250,y = 19000,label = paste("PA_eq.: ",lm_eqn_PA(all_dat2)))
data.label.pearson2 <- data.frame(x = 250, y = 18000,label = paste("PA_Pearson: ",format(cor(all_dat2$Mass,all_dat2$area),digits=3)))
data.label.spearman2 <- data.frame(x = 250, y = 17000, label = paste("PA_Spearman: ",format(cor(all_dat2$Mass,all_dat2$area,method="spearman"),digits=3)))


plotter <- (qplot(data=plot_dat, x=Mass, y=area, colour=factor(plot_dat$Cluster)) + geom_point(shape=1) + geom_smooth(method="lm") + scale_colour_discrete(name="Cross-Section Type"))

plotter <-  plotter + labs(x = "Mass kDa" ,y = expression(paste('CCS (',Å^2,')')))
plotter <- plotter + geom_text(data = data.label.equation1, aes(x = x , y = y, label = label) ,parse=TRUE)
plotter <- plotter + geom_text(data = data.label.pearson1, aes(x = x , y = y , label = label ),parse=TRUE)
plotter <- plotter + geom_text(data = data.label.spearman1, aes(x = x , y = y , label = label ),parse=TRUE)

plotter <- plotter + geom_text(data = data.label.equation2, aes(x = x , y = y , label = label ),parse=TRUE)
plotter <- plotter + geom_text(data = data.label.pearson2, aes(x = x , y = y , label = label ),parse=TRUE)
plotter <- plotter + geom_text(data = data.label.spearman2, aes(x = x , y = y , label = label ),parse=TRUE)

print(plotter)


EDIT: the data frames are below:

all_dat1 table:
          file Mass  area      Cluster
1         1bsy   18  1660 Experimental
2         1fi9   12  1240 Experimental
3  1gr5_edited  801 20900 Experimental
4    1gyk_edit  125  7030 Experimental
5         1hwz  336 12800 Experimental
6  1jyc_edited  103  5550 Experimental
7   1pkn_model  237 10300 Experimental
8   1ttr_model   56  3410 Experimental
9   2a3y_model  250 10400 Experimental
10        2hcy  143  6940 Experimental
11  3blg_model   37  2850 Experimental
12  3fdc_model   64  3640 Experimental
13       4f5sA   69  4090 Experimental

all_dat2 table:

        file Mass  area Cluster
        1bsy   18  1428      PA
        1fi9   12  1119      PA
 1gr5_edited  801 21518      PA
   1gyk_edit  125  5592      PA
        1hwz  336 10759      PA
 1jyc_edited  103  4649      PA
  1pkn_model  237  8689      PA
  1ttr_model   56  2793      PA
  2a3y_model  250  8514      PA
        2hcy  143  7255      PA
  3blg_model   37  2399      PA
  3fdc_model   64  2833      PA
       4f5sA   69  3871      PA
Harpal
  • Can you provide a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example), i.e., one including data? – Roland Dec 10 '12 at 13:23
  • I've added the data frames I used – Harpal Dec 10 '12 at 14:12
  • @Harpal you're missing a sample of all_dat, aren't you? – agstudy Dec 10 '12 at 14:31
  • The data needed to make the plot is taken from all_dat and put into all_dat1 and all_dat2. I then rbind all_dat1 and all_dat2 to make plot_dat. I don't use all_dat to plot any of the data. The data needed to re-create the plot are in all_dat1 and all_dat2 – Harpal Dec 10 '12 at 14:36
  • @Harpal OK, I see that, but the data needed for the equation isn't here. – agstudy Dec 10 '12 at 14:38
  • I corrected the code for it. – Harpal Dec 10 '12 at 14:55
  • By the way, it may not matter to you, but from a *statistical* point of view, there are a couple of potential warning signals here -- (1) your 800-kDa points are probably very influential; (2) it looks like a quadratic model might be appropriate – Ben Bolker Dec 10 '12 at 15:39

1 Answer


To remove the symbol from the legend, add this option to every geom_text call:

   show_guide = FALSE
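
As a rough illustration of where the option goes, here is a minimal sketch on made-up data. The data frames `pts` and `lab` are hypothetical stand-ins, not the asker's data. Note that ggplot2 2.0.0 and later renamed `show_guide` to `show.legend`, so the sketch uses the newer name; on the 0.9.x versions current when this was written, `show_guide = FALSE` goes in the same place.

```r
library(ggplot2)

# Hypothetical toy data standing in for plot_dat
pts <- data.frame(
  Mass    = c(10, 50, 100, 10, 50, 100),
  area    = c(1000, 3000, 6000, 900, 2600, 5200),
  Cluster = rep(c("Experimental", "PA"), each = 3)
)

# Hypothetical labels, one row per group, so the text is coloured by Cluster
lab <- data.frame(
  x       = c(30, 30),
  y       = c(5500, 5000),
  label   = c("Exp label", "PA label"),
  Cluster = c("Experimental", "PA")
)

p <- ggplot(pts, aes(x = Mass, y = area, colour = Cluster)) +
  geom_point(shape = 1) +
  # show.legend = FALSE keeps this text layer out of the legend,
  # so no letter "a" is drawn over the legend keys
  geom_text(data = lab, aes(x = x, y = y, label = label),
            show.legend = FALSE)

print(p)
```

The option is set per layer, so it has to be repeated on each `geom_text` call whose key should stay out of the legend.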
agstudy
  • Thanks, that got rid of the problem with the legend. I managed to fix the text problem by using `annotate` rather than geom_text. http://docs.ggplot2.org/0.9.2.1/annotate.html – Harpal Dec 10 '12 at 15:22
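
For readers who want to try the `annotate` approach mentioned in the comment above, here is a minimal sketch on made-up data (the data frame `pts`, the coordinates and the label text are all hypothetical). `annotate` does not inherit the plot's aesthetic mappings and adds no legend entry, and it draws the label once. A common cause of jagged-looking `geom_text` output is the same label being drawn many times on top of itself, which may be what happened in the question, though that is an assumption rather than something confirmed in the thread.

```r
library(ggplot2)

# Hypothetical toy data standing in for plot_dat
pts <- data.frame(
  Mass    = c(10, 50, 100, 10, 50, 100),
  area    = c(1000, 3000, 6000, 900, 2600, 5200),
  Cluster = rep(c("Experimental", "PA"), each = 3)
)

p <- ggplot(pts, aes(x = Mass, y = area, colour = Cluster)) +
  geom_point(shape = 1) +
  # A single text annotation; parse = TRUE renders the plotmath string
  annotate("text", x = 30, y = 5500,
           label = "italic(r)^2 == 0.99", parse = TRUE)

print(p)
```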