I saw this answer from Jayden a while ago about adding a regression equation to a plot, which I found very useful. I don't want to display R^2, though, so I changed the code a bit to this:

lm_eqn = function(m) {
l <- list(a = format(coef(m)[1], digits = 2),
  b = format(abs(coef(m)[2]), digits = 2));
if (coef(m)[2] >= 0)  {
eq <- substitute(italic(y) == a + b %.% italic(x))
} else {
eq <- substitute(italic(y) == a - b %.% italic(x))    
}
as.character(as.expression(eq));                 
}

This plots "a+bx" or "a-bx" on the graph, but without the actual coefficients replacing a and b. Does anyone know how to fix the problem? Thanks very much!

Jayden's answer:

 lm_eqn = function(m) {
 l <- list(a = format(coef(m)[1], digits = 2),
  b = format(abs(coef(m)[2]), digits = 2),
  r2 = format(summary(m)$r.squared, digits = 3));
 if (coef(m)[2] >= 0)  {
 eq <- substitute(italic(y) == a + b %.% italic(x)*","~~italic(r)^2~"="~r2,l)
 } else {
 eq <- substitute(italic(y) == a - b %.% italic(x)*","~~italic(r)^2~"="~r2,l)    
 }
 as.character(as.expression(eq));                 
 }
Bearbear

1 Answer

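It looks like you are missing the l in substitute(). Without it, substitute() has no values for a and b, so they stay in the expression as literal symbols. Pass the list as the second argument, that is, use substitute(yourFormula, l). Here's a MWE without the r^2 that parallels the one you're looking at (which I think is from "Adding Regression Line Equation and R2 on graph").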

library(ggplot2)

# Function to generate correlated data.
GenCorrData = function(mu, Sig, n = 1000) {
  U            <- chol(Sig)
  Z            <- matrix(rnorm(n*length(mu)), nrow = length(mu))
  Y            <- crossprod(U,Z) + mu
  Y            <- as.data.frame(t(Y))
  names(Y)     <- c("x", "y")
  return(Y)
}

# Function to add text
LinEqn = function(m) {
  l <- list(a = format(coef(m)[1], digits = 2),
            b = format(abs(coef(m)[2]), digits = 2));
  if (coef(m)[2] >= 0) {
    eq <- substitute(italic(y) == a + b %.% italic(x),l)
  } else {
    eq <- substitute(italic(y) == a - b %.% italic(x),l)    
  }
  as.character(as.expression(eq));                 
}

# Example
set.seed(700)
n1             <- 1000
mu1            <- c(4, 5)
Sig1           <- matrix(c(1, .8, .8, 1), nrow = length(mu1))
df1            <- GenCorrData(mu1, Sig1, n1)
scatter1       <- ggplot(data = df1, aes(x, y)) +
                    geom_point(shape = 21, color = "blue", size = 3.5) +
                    scale_x_continuous(expand = c(0, 0), limits = c(0, 8)) +
                    scale_y_continuous(expand = c(0, 0), limits = c(0, 8))
scatter.line1  <- scatter1 + 
                    geom_smooth(method = "lm", formula = y ~ x, se = FALSE, 
                                color="black", size = 1) +
                    annotate("text", x = 2, y = 7, color = "black", size = 5,
                             label = LinEqn(lm(y ~ x, df1)), parse = TRUE)
scatter.line1

[Figure: scatter plot with the fitted regression line and equation]
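
The effect of the second argument can also be seen in isolation, without fitting a model. The following is a minimal sketch, not part of the original MWE; the list `vals` and its values are made up for illustration:

```r
# Hypothetical, already-formatted coefficients (illustrative values only).
vals <- list(a = "4.2", b = "0.8")

# No second argument: nothing supplies values for a and b, so they stay as symbols.
no_list <- substitute(italic(y) == a + b %.% italic(x))

# List as second argument: a and b are replaced by the list entries.
with_list <- substitute(italic(y) == a + b %.% italic(x), vals)

# all.vars() lists the symbols still left in each expression.
vars_no_list   <- all.vars(no_list)    # still includes "a" and "b"
vars_with_list <- all.vars(with_list)  # "a" and "b" are gone

# The string that would be handed to annotate(..., parse = TRUE).
label <- as.character(as.expression(with_list))

print(vars_no_list)
print(vars_with_list)
print(label)
```

If "a" and "b" still show up in the `all.vars()` output for your own expression, the list has not been applied.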

Pat W.
  • Thanks. I added 1, but then got the error message as below: Error in substitute(italic(y) == a - b %.% italic(x), 1) : invalid environment specified. How can I fix this? – Bearbear Nov 24 '14 at 19:55
  • @Bearbear, `substitute()` is looking for the name of your list as its second argument. Didn't you call it `l` and not `1`? – Pat W. Nov 24 '14 at 21:10
  • @Pat W., I made a mistake by typing the letter "l" as the number "1". Now it works, thank you very much. – Bearbear Nov 24 '14 at 21:58
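
The error from the comments can be reproduced on its own. This is a small sketch for illustration only; the list name `coefs` is a hypothetical choice, picked because it is harder to mistype than `l`:

```r
coefs <- list(a = "4.2", b = "0.8")  # illustrative values, not from a real fit

# Passing the number 1 instead of a list: substitute() rejects it.
msg <- tryCatch(
  {
    substitute(a + b, 1)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Passing the list itself works: a and b are replaced by its entries.
ok <- substitute(a + b, coefs)
print(ok)
```

Renaming the list inside LinEqn() in the same way would avoid the `l` versus `1` mix-up entirely.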