The following code produces three plots. The first plot uses data from df_fault and plots lines with symbols, and that plot is fine. The second plot uses data from df_maint and plots lines, and that plot is fine also. The problem is with the 3rd plot, which combines the lines with symbols from df_fault with the lines from df_maint. The legend is incorrect, and there are two legends, one for lines and one for symbols. How can I get one correct legend with four entries?
Create some sample data
library(zoo)
library(ggplot2)
# Draw N sorted random timestamps between st and et
rDates <- function(N, st="2012/01/01", et="2012/12/31") {
  st <- as.POSIXct(as.Date(st))
  et <- as.POSIXct(as.Date(et))
  dt <- as.numeric(difftime(et, st, units="secs"))
  ev <- sort(runif(N, 0, dt))
  st + ev
}
first_maint <- as.POSIXct(strptime("2014/01/01", "%Y/%m/%d"))
last_maint <- as.POSIXct(strptime("2014/12/31", "%Y/%m/%d"))
first_fault <- as.POSIXct(strptime("2014/05/01", "%Y/%m/%d"))
last_fault <- as.POSIXct(strptime("2014/07/31", "%Y/%m/%d"))
set.seed(31)
nMDates=40
nFDates=10
rMaintDates <- rDates(nMDates,first_maint,last_maint)
rFaultDates <- rDates(nFDates,first_fault,last_fault)
df_fault <- data.frame(date = rFaultDates,
                       type = "Non-Op",
                       ci = runif(nFDates,.7,1.8), stringsAsFactors=FALSE)
df_fault$type[sample(1:nFDates,3)] = "Advisory"
z_hr <- zoo(c(0,0,9.9,9.9),c(first_maint,first_fault,last_fault,last_maint))
z_maint <- zoo(,rMaintDates[c(-1,-nMDates)])
z_hr_maint_a <- merge(z_hr,z_maint)
z_hr_maint <- na.approx(z_hr_maint_a)
z_repair <- zoo(c(0,3000,5000,8000),c(first_maint,first_fault,last_fault,last_maint))
z_repair_maint_a <- merge(z_repair,z_maint)
z_repair_maint <- na.approx(z_repair_maint_a)
df_maint <- data.frame(date=index(z_hr_maint),
                       hrs=coredata(z_hr_maint)/9.8,
                       repairs=coredata(z_repair_maint)/8000)
Plot the sample data. These two examples work:
rpr_title = "repairs/8000"
flt_title = "hrs/9.8"
(gp2 <- ggplot(data=df_fault, aes(x=date, y=ci, color=type)) +
   labs(x="Date (2014)", y="CI Amplitude", title="Sample, this plot is fine, df_fault") +
   geom_line(aes(group=type)) +   # shape is not a line aesthetic, so it is only mapped on the points
   geom_point(aes(group=type, shape=type), size=4) +
   theme(plot.title=element_text(size=12),
         axis.title=element_text(size=8)) )
(gp2a <- ggplot() +
   geom_line(data=df_maint, aes(x=date, y=repairs, color=rpr_title)) +
   geom_line(data=df_maint, aes(x=date, y=hrs, color=flt_title)) +
   labs(x="Date (2014)", y="CI Amplitude", title="Sample, this plot is fine, df_maint ")
)
The first plot (gp2) shows the fault data.
The second plot (gp2a) shows the maintenance and usage data.
I would like to combine the above two plots into one plot with four legend entries. Here is my current attempt, but the legend isn't correct:
(gp2b <- gp2 +
   geom_line(data=df_maint, aes(x=date, y=repairs, color=rpr_title)) +
   geom_line(data=df_maint, aes(x=date, y=hrs, color=flt_title)) +
   labs(x="Date (2014)", y="CI Amplitude", title="Sample, this plot the legend is wrong")
)
In this plot there are two legends, and neither one is correct. The first "type" legend has the wrong symbols on the lines: it shows a circle for every entry. The second "type" legend shows two black symbols, so its colors are incorrect. I would like the 2nd legend removed and the 1st legend to show the correct lines, symbols, and colors. The legend entry for "Advisory" is already correct. The entry for "Non-Op" should have a triangle instead of a circle. The entries for "hrs/9.8" and "repairs/8000" should have only a line, no symbol. Also, it would be nice if the lines without symbols could be wider.
Brandon's suggestion to use melt helps, but the plot below still doesn't have the correct legend:
library(reshape2) # provides melt()
names(df_fault)[2:3] <- c("variable","value") # for rbind
dat <- melt(df_maint, c("date")) # melted
dat <- rbind(dat, df_fault)
p1 <- ggplot(dat, aes(date,value, group = variable, color = variable)) + geom_line()
p1 + geom_point(data = dat[dat$variable %in% c("Advisory","Non-Op"),],
                aes(date,value, group = variable, color = variable, shape=variable)) +
  scale_colour_discrete(name ="Fleet",
                        breaks=c("hrs", "repairs","Advisory","Non-Op"),
                        labels=c("usage hrs", "maint repairs","Advisory Faults","Non-Op Faults")) +
  scale_shape_discrete(name ="Fleet",
                       breaks=c("hrs", "repairs","Advisory","Non-Op"),
                       labels=c("usage hrs", "maint repairs","Advisory Faults","Non-Op Faults"),
                       guide = "none")
Post script: I want to mention that it took some effort to apply the above procedure to an actual data set. Here's a summary of the process.
1) Identify the x-axis variable and the grouping variables.
2) In the two data frames, rename the x-axis variable and the grouping variables so that they have the same names.
3) Use melt twice (the example only used it once) to generate two melted data frames. Pass the x-axis and grouping variables as id.vars, and the variables that you want to plot as measure.vars.
3b) Run head on the melted data frames. You should see the x-axis variable and the grouping variables, followed by the variable and value columns. The variable column holds text values corresponding to the different y-axis names.
4) Use rbind to combine the two melted data frames.
5) Run head on the results of steps 3 and 4 so that you understand how the data is stored.
6) Plot the lines for all the data. Set the legend title in this step, using + guides(color=guide_legend(title="Fleet")). This command is not in the example above.
7) Create a subset of the melted data frame containing the data that will have symbols. Add the symbols, but suppress the 2nd (shape) legend with + scale_shape_discrete(name ="Fleet", guide = "none"), as in the example.
8) Adjust the legend line symbols using + guides(colour = guide_legend(override.aes = list(shape = c(32,32,16,17)))). Shape 32 is a space character, so those two entries show only a line.
9) Once you can see a nominal plot of lines with some symbols and the correct legend, you may need to repeat the above process after sorting the combined melted data frame, in order to get the correct lines and symbols in front. You may want to sort on variable and on the x-axis field.
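The steps above can be sketched roughly as follows. This is only a minimal sketch on made-up toy data: the data frames df_lines and df_pts, their values, and the column name NonOp are hypothetical stand-ins for a real data set. It also swaps in scale_shape_manual for scale_shape_discrete, so that the shapes in the legend override are certain to match the plotted ones, and it uses NA rather than 32 to blank the symbol on the line-only entries.

```r
library(reshape2)  # melt()
library(ggplot2)

# Hypothetical toy data: one wide frame drawn as lines only,
# one wide frame drawn as lines with symbols.
df_lines <- data.frame(date    = as.Date("2014-01-01") + seq(0, 300, by = 60),
                       hrs     = seq(0, 1, length.out = 6),
                       repairs = c(0, 0.1, 0.4, 0.5, 0.8, 1))
df_pts   <- data.frame(date     = as.Date("2014-05-01") + c(0, 20, 40, 60),
                       Advisory = c(1.2, 0.9, 1.5, 1.1),
                       NonOp    = c(0.8, 1.7, 1.0, 1.4))

# Steps 3-4: melt each frame with the same id.vars, then stack them.
m_lines <- melt(df_lines, id.vars = "date", measure.vars = c("hrs", "repairs"))
m_pts   <- melt(df_pts,   id.vars = "date", measure.vars = c("Advisory", "NonOp"))
dat     <- rbind(m_lines, m_pts)
head(dat)  # step 5: columns are date, variable, value

# Step 7: subset of the rows that should carry symbols.
pts <- dat[dat$variable %in% c("Advisory", "NonOp"), ]

# Steps 6-8: lines for everything, symbols for the subset,
# shape legend hidden, colour legend keys overridden.
p <- ggplot(dat, aes(date, value, group = variable, colour = variable)) +
  geom_line() +
  geom_point(data = pts, aes(shape = variable), size = 4) +
  scale_shape_manual(values = c(Advisory = 16, NonOp = 17), guide = "none") +
  guides(colour = guide_legend(title = "Fleet",
                               override.aes = list(shape = c(NA, NA, 16, 17))))
p
```

The override vector follows the order of the factor levels of variable after rbind (here hrs, repairs, Advisory, NonOp), so it is worth checking levels(dat$variable) before choosing the shapes.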