I'm working on automating some basic experimental analysis using R. My script is currently set up as follows and generates the plot shown below.
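library(ggplot2)   # ggplot, geom_line, labs, ggsave
library(reshape2)  # melt (assumed source; its variable.name argument matches the call below)

# experiments, plot, ylabel and title are assumed to be defined earlier in the script
data <- list()
for (experiment in experiments) {
  path <- paste0('../out/', experiment, '/', plot, '.csv')
  data[[experiment]] <- read.csv(path, header = FALSE)
}
# One column per series, each holding the column means of that experiment's CSV
df <- data.frame(Year = 1:40,
                 current = colMeans(data[['current']]),
                 vip = colMeans(data[['vip']]),
                 vipbonus = colMeans(data[['vipbonus']]))
# Reshape to long format: one row per (Year, Series) pair
df <- melt(df, id.vars = 'Year', variable.name = 'Series')
plotted <- ggplot(df, aes(Year, value)) +
  geom_line(aes(colour = Series)) +
  labs(y = ylabel, title = title)
file <- paste0(plot, '.png')
ggsave(filename = file, plot = plotted)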
While this is close to what we want the final product to look like, the series labels need to be updated. Ideally we want them to be something like "VIP, no bonus", "VIP, with bonus" and so forth, but labels like that are not valid column names in a data frame (invalid characters are automatically replaced with ".", even when the name is wrapped in backticks). Since these experiments are a work in progress, we also know that we are going to need more series labels in the future, so we don't want to lose ggplot's ability to set the colors for us automatically.
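For illustration, the renaming can be reproduced with base R alone. This is a minimal sketch: the names `quoted` and `backticked` and the numeric values are made up for the example and are not part of the script above.

```r
# Toy values only; in the real script each column comes from colMeans().
# A quoted column name containing a comma and spaces:
quoted <- data.frame(Year = 1:3, 'VIP, no bonus' = c(0.1, 0.2, 0.3))
print(names(quoted))      # "Year" "VIP..no.bonus"

# The same name wrapped in backticks is rewritten in the same way:
backticked <- data.frame(Year = 1:3, `VIP, no bonus` = c(0.1, 0.2, 0.3))
print(names(backticked))  # "Year" "VIP..no.bonus"
```

In both cases the comma and the spaces are each replaced with ".", so the legend would show "VIP..no.bonus" rather than the label we want.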
How can I set the series labels to be appropriate for humans?