I have data in a CSV file like this:
Year,A,B,C,D,E,F,G
2007,3.35,,,,,,
2008,3.54,3.59,,,,,
2009,3.22,3.46,4.43,,,,
2010,3.82,3.63,4.64,,,,
2011,2.91,3.74,4.5,4.13,4.38,,
2012,3.85,3.57,4.13,4,4,4,
2013,4.33,2.93,4.63,4.71,4.25,,
2014,4.73,4,4.81,4.66,4.33,,4
2015,,,4.89,4.68,,,
I'm trying to plot it like this:
library(ggplot2)
library(reshape2)  # melt() is assumed to come from reshape2
scores_raw <- read.csv("scores.csv", header = TRUE, fill = TRUE)
scores_melt <- melt(scores_raw, id.vars = "Year")
scores_symb <- c(15, 17, 16, 16, 16, 16, 16)
plot_scores <- ggplot(scores_melt, aes(x=Year, y=value, colour=variable, shape=variable))
plot_scores +
  geom_line() +
  geom_point(size = 10, alpha = 0.6) +
  scale_shape_manual(values = scores_symb,
                     name = "Cohort\nSize",
                     labels = c("200", "100", "25")) +
  ylab("Score (5 = max)") +
  scale_y_continuous(limits = c(0, 5)) +
  theme_bw() +
  theme(
    text = element_text(size = 30),
    axis.title.y = element_text(vjust = 1.5),
    axis.title.x = element_text(vjust = 0.1),
    plot.background = element_rect(fill = "transparent", colour = NA),
    legend.justification = c(0, 0), legend.position = c(0, 0), # legend.position = "none"
    legend.background = element_rect(fill = "transparent", size = .5, linetype = "dotted")
  )
As you can tell, I've got 7 series but only want to tell them apart by 3 cohort sizes (i.e. shape).
I would like the legend to show only the three shapes that distinguish the three types of data I've got. At the moment, I can produce either a single legend with shapes and colours combined, or (as in the code above) two legends: one with shapes (four of which are NA) and the other with colours.
Help please!
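
For reference, here is a minimal sketch of one possible approach, not a definitive solution: map shape to a separate cohort-size column instead of to the series, and suppress the colour legend. It assumes that series A is the 200 cohort, B the 100 cohort, and C to G the 25 cohort; that mapping is only inferred from the order of scores_symb and the labels above. The names cohort_of and cohort are made up for illustration, melt() is assumed to come from reshape2, and guides(colour = "none") assumes a reasonably recent ggplot2 (older versions used FALSE). The CSV is inlined so the sketch runs on its own.

```r
library(ggplot2)
library(reshape2)

# Inline copy of the CSV from the question, so the sketch is self-contained.
csv_text <- "Year,A,B,C,D,E,F,G
2007,3.35,,,,,,
2008,3.54,3.59,,,,,
2009,3.22,3.46,4.43,,,,
2010,3.82,3.63,4.64,,,,
2011,2.91,3.74,4.5,4.13,4.38,,
2012,3.85,3.57,4.13,4,4,4,
2013,4.33,2.93,4.63,4.71,4.25,,
2014,4.73,4,4.81,4.66,4.33,,4
2015,,,4.89,4.68,,,"

scores_raw <- read.csv(text = csv_text, header = TRUE)

# Long format; drop the empty cells so they do not trigger missing-value warnings.
scores_melt <- melt(scores_raw, id.vars = "Year", na.rm = TRUE)

# ASSUMPTION: series-to-cohort mapping inferred from scores_symb and the labels.
cohort_of <- c(A = "200", B = "100", C = "25", D = "25",
               E = "25", F = "25", G = "25")
scores_melt$cohort <- factor(cohort_of[as.character(scores_melt$variable)],
                             levels = c("200", "100", "25"))

plot_scores <- ggplot(scores_melt,
                      aes(x = Year, y = value,
                          colour = variable,   # one colour per series
                          shape = cohort,      # one shape per cohort size
                          group = variable)) + # keep one line per series
  geom_line() +
  geom_point(size = 10, alpha = 0.6) +
  scale_shape_manual(values = c("200" = 15, "100" = 17, "25" = 16),
                     name = "Cohort\nSize") +
  guides(colour = "none") +                    # hide the colour legend only
  ylab("Score (5 = max)") +
  scale_y_continuous(limits = c(0, 5)) +
  theme_bw()

plot_scores
```

Because shape is now mapped to a three-level factor, the shape scale has exactly three entries, so no NA keys should appear in the legend. The theme() settings from the original code can be added back unchanged.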