I had an old function that worked like a charm:

lm_eqn = function(m) {

    l <- list(a = format(coef(m)[1], digits = 2),
              b = format(abs(coef(m)[2]), digits = 2),
              r2 = format(summary(m)$r.squared, digits = 3))

    eq <- substitute(italic(C)[i] == a + b %.% italic(I)[i]*","~~italic(r)^2~"="~r2, l)

    as.character(as.expression(eq))
}

where m was an lm model. This would produce an equation like the following:

y = 0.3 + 4.4x, r² = 0.67

which could then be used in a ggplot to show the model formula alongside its graph. The problem is that the same function now produces unwanted symbols:

y = c(0.3) + c(4.4)x, r² = 0.67

Each coefficient taken from the list I build is now wrapped in c(), and I don't know why. Does anyone know how to

a) prevent this, or b) correct it?

Note: the problem seems to arise in the substitution step; the output of eq is:

"italic(y) == c(`(Intercept)` = \"0.3\") + c(x = \"4.4\") %.% italic(x) * \",\" ~ ~italic(r)^2 ~ \"=\" ~ \"0.67\""

It looks like substitute's output includes the c() for the intercept and slope.

Edit

m in this case is an ordinary lm object. For example:

x <- c(5,3,6,8,2,6)
y <- c(2,6,3,7,4,9)
test.lm <- lm(y~x)

lm_eqn(test.lm)
[1] "italic(C)[i] == c(`(Intercept)` = \"3.3\") + c(x = \"0.37\") %.% italic(I)[i] * \",\" ~ ~italic(r)^2 ~ \"=\" ~ \"0.0969\""
Lee Drake

1 Answer

You apparently need to unname the coef() values:

lm_eqn = function(m) {

    l <- list(a = format(unname(coef(m))[1], digits = 2),
              b = format(abs(unname(coef(m))[2]), digits = 2),
              r2 = format(summary(m)$r.squared, digits = 3))

    eq <- bquote(italic(C)[i] == .(l$a) + .(l$b) %.% italic(I)[i]*","~~italic(r)^2~"="~.(l$r2))

    as.character(as.expression(eq))
}
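
As far as I can tell, unname() helps because format() keeps the names of whatever it is given. The value being substituted is therefore a named length-one character vector, and when as.character() deparses the expression the name has to be written out somehow, which is where the c(name = "value") form comes from. Here is a minimal sketch of that idea, using a made-up value in place of coef(m)[1]; the variable names are only illustrative and the exact deparsed text of the named case may depend on your R version:

```r
# A named length-one character vector, standing in for format(coef(m)[1], digits = 2)
named <- format(c(`(Intercept)` = 3.3), digits = 2)
plain <- unname(named)

names(named)  # format() kept the name of the coefficient
names(plain)  # unname() dropped it

# Substitute each value into the same small expression, then deparse it
with_names    <- as.character(as.expression(substitute(y == a, list(a = named))))
without_names <- as.character(as.expression(substitute(y == a, list(a = plain))))

print(with_names)     # the name travels along with the value
print(without_names)  # just the quoted string constant
```

Comparing the two printed strings on your own installation should show whether the names attribute is what triggers the c() wrapper there.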

I also think you need to clarify exactly what you are hoping to see. At the moment you are creating an expression vector with two elements and then converting that to a character value. The fact that ggplot requires character values for its "expressions" makes it quite difficult to look at a character value and figure out what will be displayed, so you should probably expand your test code to include the manner in which this value will be delivered. (It's much easier to look at a real R expression.) I think there are mechanisms that allow unevaluated expressions to be passed to ggplot annotations and titles, but they seem incredibly convoluted to my eyes.
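
One way to make the character value easier to inspect is to turn it back into a real R expression with parse(text = ...), which I believe is essentially what a consumer that parses its labels does with it. The sketch below is only an illustration: it hard-codes the numbers from the test.lm example rather than fitting a model, and the names eq, label and back are mine.

```r
# The call that lm_eqn() builds, with the example's values written in by hand
eq <- quote(italic(C)[i] == "3.3" + "0.37" %.% italic(I)[i] * "," ~ ~italic(r)^2 ~ "=" ~ "0.0969")

# The character value that lm_eqn() would hand to the plotting code
label <- as.character(as.expression(eq))

# Turn the string back into an expression so it can be examined as R code
back <- parse(text = label)

print(back[[1]])           # prints as a call rather than as an escaped string
identical(back[[1]], eq)   # checks whether the round trip preserved the call
```

In ggplot2 the same string would typically be passed as the label of a text layer with parse = TRUE, for example annotate("text", x = ..., y = ..., label = label, parse = TRUE), where the coordinates are whatever suits your plot.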

You could also use substitute, which requires a list with named elements:

lm_eqn = function(m) {

    l <- list(a = format(unname(coef(m))[1], digits = 2),
              b = format(abs(unname(coef(m))[2]), digits = 2),
              r2 = format(summary(m)$r.squared, digits = 3))

    eq <- substitute(italic(C)[i] == a + b %.% italic(I)[i]*","~~italic(r)^2 == r2, env = l)

    as.character(as.expression(eq))
}

lm_eqn(test.lm)
[1] "italic(C)[i] == \"3.3\" + \"0.37\" %.% italic(I)[i] * \",\" ~ ~italic(r)^2 == \"0.0969\""
IRTFM