I have the following R code, in which I want the beta_i in the legend to be rendered as actual Greek betas. Please ignore any Danish in the code and error messages. The code is supposed to show the solution path of a ridge regression. The actual code is much longer, with several plots that have the same issue.
library(latex2exp)
library(glmnet)
library(MASS)
library(ggplot2)
library(reshape)
library(gridExtra)
set.seed(10)
Y = rnorm(100)
Y = scale(Y)
X=matrix(rnorm(100*8),ncol=8)
X = scale(X)
fitR = glmnet(X,Y, alpha = 0)
beta = coef(fitR)
temp = as.data.frame(as.matrix(beta)) # Convert to a data frame
temp$coef = row.names(temp) # Create a new column with the coefficient names
temp = temp[temp$coef != "(Intercept)",] # Remove the intercept, which is 0 because the data are standardized
temp = reshape::melt(temp, id = "coef") # Merge the 100 columns into long format
temp$variable = as.numeric(gsub("s", "", temp$variable)) # Convert the variable names (s0, s1, ...) to numbers
temp$lambda = fitR$lambda[temp$variable+1] # Look up the matching lambdas
temp$coef = paste("beta_", gsub("V", "", temp$coef), sep="")
plot1 = ggplot(temp, aes(lambda, value, color = coef)) +
  xlim(0,75) +
  geom_line() +
  ggtitle(TeX("Ridge estimater mod $\\lambda$")) +
  xlab(TeX("$\\lambda$")) + ylab("Estimat") +
  guides(color = guide_legend(title = "")) +
  theme_bw() +
  theme(legend.key.width = unit(3,"lines"))
grid.arrange(plot1)
The important vector, temp$coef, is a character vector holding one beta_i label (i = 1, ..., 8) per row of temp, so each label is repeated once per lambda value. I have tried, without luck, to write:
ggplot(temp, aes(lambda, value, color = paste('TeX("$\\', coef, '$")', sep=''))
but this results in an error: "Fejl: Cannot add ggproto objects together. Did you forget to add this object to a ggplot object?".
Inspired by this and this post, I replaced the line
guides(color = guide_legend(title = "")) +
with
scale_color_discrete(labels = parse(text= paste("beta[", 1:8, "]", sep=""))) +
which does produce the Greek letters. However, I have two problems with this. First, I end up using the non-LaTeX notation "beta[i]" instead of the LaTeX-style "beta_i", even though I use LaTeX in the rest of the code. Second, this only works because in my case all entries in temp$coef are of the form "beta_i". If these 8 entries were e.g.
temp$coef = c("alpha_1", "beta_2", ..., "theta_8")
then I would not be able to do the same.
So my question is this: given a vector of LaTeX-style names (e.g. c("alpha_1", ..., "theta_8")), is there a way to build the legend of a ggplot from the elements of this vector?
As this is my first post here, please let me know if I need to change anything.
Edit
Based on the comments by user2554330, I have tried using:
scale_color_discrete(labels = TeX(temp$coef)) +
which doesn't give any errors, but it doesn't show any names in the legend.
Using $...$ around the temp$coef gives the error:
Fejl: uventet '$' in:" xlab(TeX("$\\lambda$")) + ylab("Estimat")+ scale_color_discrete(labels = TeX($"
Writing scale_color_discrete(labels = TeX(\\temp$coef)) +
gives a similar error.
I've also tried using:
scale_color_discrete(labels = TeX(paste('$\\', unique(temp$coef), '$', sep=''))) +
but this just writes the non-greek beta_1, ..., beta_8 in the legend.
Finally, writing:
scale_color_discrete(labels = TeX(unique(temp$coef)))
achieves half the goal. In the legend the i is now rendered as a subscript, but "beta" is still spelled out instead of being shown as the Greek letter.
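For reference, the parse(text = ...) workaround from earlier does not have to hard-code "beta". Below is a minimal base-R sketch that converts LaTeX-style names into plotmath strings and then into expressions. The helper to_plotmath and the example names are hypothetical, and the sketch assumes every name has the form symbol_subscript with a single underscore. It still relies on plotmath rather than LaTeX, so it only addresses the second of the two problems.

```r
# Hypothetical helper: turn LaTeX-style names such as "beta_1" into
# plotmath strings such as "beta[1]".
# Assumption: each name is <symbol>_<subscript> with exactly one underscore.
to_plotmath <- function(x) {
  sub("^([^_]+)_(.+)$", "\\1[\\2]", x)
}

# Hypothetical example names, standing in for unique(temp$coef)
coef_names <- c("alpha_1", "beta_2", "theta_8")

# Character vector of plotmath strings: "alpha[1]", "beta[2]", "theta[8]"
plotmath_strings <- to_plotmath(coef_names)

# Expression vector with one element per name, usable as legend labels
legend_labels <- parse(text = plotmath_strings)
```

In the plot above this could presumably be passed as scale_color_discrete(labels = parse(text = to_plotmath(sort(unique(temp$coef))))), assuming the sorted names line up with the order of the legend keys.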