I'd think this would be fairly fundamental, but I can't find how to do it in any of my introductory texts or by googling. I want to plot the mean of a continuous variable against a categorical variable, grouped by a factor. The continuous variable is 'cd' (blood CD4 protein), the categorical variable is year (1 to 10), and the factor is failure (0 or 1). My dataset is 'F3'.
I've used aggregate to get the mean cd by year, but can't find how to group that by failure (0 = no, 1 = yes). I would prefer to use ggplot.
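For reference, here is a minimal sketch of how aggregate can take both grouping variables at once through its formula interface. The data frame is rebuilt compactly from the values posted under Data so that the snippet runs on its own, and the name `means` is only an illustrative choice.

```r
# Compact rebuild of the posted F3 (same values as the dput output under Data)
F3 <- data.frame(
  year    = factor(c(6:10, 1:6), levels = 1:10),
  cd      = c(555, 511, 540, 596, 553, 142, 173, 271, 163, 108, 61),
  failure = factor(c(rep(0, 5), rep(1, 6)))
)

# Mean cd for every year/failure combination that occurs in the data
means <- aggregate(cd ~ year + failure, data = F3, FUN = mean)
print(means)
```

In this small sample every year/failure combination has a single row, so the means equal the raw cd values; with the full dataset each row of `means` would average several observations.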
The plot I get from this:
ggplot(F3, aes(factor(year), mean(cd), color = factor(failure))) +
  geom_line() +
  geom_point(size = 2)
is a single horizontal line (or two lines overlaid), although the legend does show a grouping by failure. So it's plotting the overall mean of cd, not the mean cd by year. Please help.
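The flat line most likely comes from mean(cd) being evaluated once over the whole column inside aes(), so every point receives the same y value. One possible sketch of a fix, assuming ggplot2 3.3.0 or later (where stat_summary accepts `fun`), is to map the raw cd to y and let stat_summary compute the mean within each year and failure group. The explicit `group = failure` is there because year is a factor, and with a discrete x axis the line would otherwise not connect points across years. The data frame is rebuilt compactly from the posted values so the snippet is self-contained.

```r
library(ggplot2)

# Compact rebuild of the posted F3 (same values as the dput output under Data)
F3 <- data.frame(
  year    = factor(c(6:10, 1:6), levels = 1:10),
  cd      = c(555, 511, 540, 596, 553, 142, 173, 271, 163, 108, 61),
  failure = factor(c(rep(0, 5), rep(1, 6)))
)

# y is the raw cd; stat_summary takes the mean per year within each failure group
p <- ggplot(F3, aes(x = year, y = cd, colour = failure, group = failure)) +
  stat_summary(fun = mean, geom = "line") +
  stat_summary(fun = mean, geom = "point", size = 2)

print(p)
```

An alternative would be to compute the means first with aggregate and then plot that summary table with geom_line() and geom_point(); the stat_summary route simply keeps the summarising step inside the plot call.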
Data:
F3 <- structure(list(year = structure(c(6L, 7L, 8L, 9L, 10L, 1L, 2L,
3L, 4L, 5L, 6L), .Label = c("1", "2", "3", "4", "5", "6", "7",
"8", "9", "10"), class = "factor"), cd = c(555L, 511L, 540L,
596L, 553L, 142L, 173L, 271L, 163L, 108L, 61L), failure = structure(c(1L,
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("0", "1"), class = "factor")), .Names = c("year",
"cd", "failure"), row.names = c("1", "2", "3", "4", "5", "6",
"7", "8", "9", "10", "11"), class = "data.frame")