I'm trying to plot a graph between two columns of data from the data frame called "final". I want the p value and r^2 value to show up on the graph.

I'm using the function and code below, but it gives me the error "cannot find y value":

library(ggplot2)
lm_eqn <- function(final, x, y){
  m <- lm(final[,y] ~ final[,x])

  output <- paste("r.squared = ", round(summary(m)$adj.r.squared, digits = 4), " | p.value = ", formatC(summary(m)$coefficients[8], format = "e", digits = 4))
  return(output)
}

output_plot <- lm_eqn(final, x, y)

p1 <- ggplot(final, aes(x=ENSG00000153563, y= ENSG00000163599)) + geom_point() + geom_smooth(method=lm, se=FALSE) + labs(x = "CD8A", y = "CTLA-4") + ggtitle("CD8 v/s CTLA-4", subtitle = paste("Linear Regression of Expression |", output_plot))

How do I pass both columns of data, x and y, through the function so that the graph is plotted with the p value and r^2 value printed on it?

Thanks in advance.

  • Please [see here](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) on making an R question folks can help with, including a representative sample of your data – camille Aug 12 '18 at 04:10

1 Answer

When you call the function to generate output_plot, you have to pass it the same ENS... variables that you use in your plot. With the function simplified slightly, this should work:

   library(stats)
   library(ggplot2)

   # Fit y ~ x and return the adjusted R^2 and the slope's p-value as one label
   lm_eqn <- function(x, y){
     m <- lm(y ~ x)
     # coefficients[8] is row 2, column 4 of the 2 x 4 coefficient table: the slope's p-value
     output <- paste("r.squared = ", round(summary(m)$adj.r.squared, digits = 4), " | p.value = ", formatC(summary(m)$coefficients[8], format = "e", digits = 4))
     return(output)
   }

   # Example data
   x <- c(1,2,5,2,3,6,7,0)
   y <- c(2,3,5,9,8,3,3,1)
   final <- data.frame(x, y)  # base R; data_frame() would need the tibble or dplyr package

   output_plot <- lm_eqn(x, y)

   p1 <- ggplot(final, aes(x=x, y= y)) + geom_point() + geom_smooth(method=lm, se=FALSE) + labs(x = "x", y = "y") + ggtitle("CD8 v/s CTLA-4", subtitle = paste("Linear Regression of Expression |", output_plot))
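If you would rather keep the data frame as an argument, as in the question, one option is to pass the column names as strings. The following is only a sketch under that assumption: `lm_label` is a hypothetical name, and it builds the formula with base R's `reformulate()`. Note that the value reported is the adjusted R^2, so the label says so.

```r
# Hypothetical helper: takes a data frame plus two column names given as strings.
lm_label <- function(df, x_col, y_col) {
  # reformulate() builds the formula  y_col ~ x_col  from the two strings
  m <- lm(reformulate(x_col, response = y_col), data = df)
  s <- summary(m)
  # Row 2 of the coefficient table is the slope; "Pr(>|t|)" holds its p-value
  p <- s$coefficients[2, "Pr(>|t|)"]
  paste0("adj. r.squared = ", round(s$adj.r.squared, 4),
         " | p.value = ", formatC(p, format = "e", digits = 4))
}

# Same example data as above
final <- data.frame(x = c(1, 2, 5, 2, 3, 6, 7, 0),
                    y = c(2, 3, 5, 9, 8, 3, 3, 1))

output_plot <- lm_label(final, "x", "y")
```

With the real data the call would presumably look like `lm_label(final, "ENSG00000153563", "ENSG00000163599")`, using the column names from the question's plot.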
Nar
  • Thanks so much, this works! But I have multiple graphs to make using various columns in my data frame. Why do they all have the same value reported at the top when I use this function? – Julio Lopez Aug 12 '18 at 02:10
  • Make sure you are calling output_plot <- lm_eqn(x, y) with the new variables before each plot. – Nar Aug 12 '18 at 08:28
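To illustrate the last comment: if output_plot is computed once and reused, every plot shows the same label. A rough sketch of recomputing it per pair of columns is below; the data and the column names `a`, `b`, `c` are made up for illustration, and `lm_eqn` is the vector-based version from the answer.

```r
# lm_eqn as in the answer: takes two numeric vectors
lm_eqn <- function(x, y) {
  m <- lm(y ~ x)
  paste("r.squared = ", round(summary(m)$adj.r.squared, digits = 4),
        " | p.value = ", formatC(summary(m)$coefficients[8], format = "e", digits = 4))
}

# Made-up example data; a, b and c are placeholder column names
final <- data.frame(a = c(1, 2, 5, 2, 3, 6, 7, 0),
                    b = c(2, 3, 5, 9, 8, 3, 3, 1),
                    c = c(0, 3, 9, 5, 5, 13, 15, 1))

# Each entry is c(x column, y column)
pairs <- list(c("a", "b"), c("a", "c"))

# Recompute the label for every pair so that no stale output_plot is reused
labels <- vapply(pairs,
                 function(p) lm_eqn(final[[p[1]]], final[[p[2]]]),
                 character(1))
names(labels) <- vapply(pairs, paste, character(1), collapse = " vs ")
```

Each element of `labels` can then be pasted into the subtitle of the matching plot's ggtitle() call, instead of a single shared output_plot.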