-2

The following code produces three plots. The first plots lines with symbols from df_fault, and that plot is fine. The second plots lines from df_maint, and that plot is fine also. The problem is with the third plot, which combines the lines with symbols from df_fault with the lines from df_maint. The legend is incorrect, and there are two legends, one for lines and one for symbols. How can I get one correct legend with four entries?

Create some sample data

library(zoo)
library(ggplot2)

rDates <- function(N, st="2012/01/01", et="2012/12/31") {
  st <- as.POSIXct(as.Date(st))
  et <- as.POSIXct(as.Date(et))
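  dt <- as.numeric(difftime(et, st, units="secs"))  # span of the interval in seconds
  ev <- sort(runif(N, 0, dt))                       # sorted random offsets within the span
  st + ev                                           # return the random timestamps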
}

first_maint <- as.POSIXct(strptime("2014/01/01", "%Y/%m/%d")) 
last_maint <- as.POSIXct(strptime("2014/12/31", "%Y/%m/%d")) 

first_fault <- as.POSIXct(strptime("2014/05/01", "%Y/%m/%d")) 
last_fault <- as.POSIXct(strptime("2014/07/31", "%Y/%m/%d")) 

set.seed(31)
nMDates=40
nFDates=10
rMaintDates <- rDates(nMDates,first_maint,last_maint)
rFaultDates <- rDates(nFDates,first_fault,last_fault)

df_fault <- data.frame(date = rFaultDates, 
                       type = "Non-Op",
                       ci = runif(nFDates,.7,1.8),stringsAsFactors=FALSE)
df_fault$type[sample(1:nFDates,3)] = "Advisory"


z_hr <- zoo(c(0,0,9.9,9.9),c(first_maint,first_fault,last_fault,last_maint))
z_maint <- zoo(,rMaintDates[c(-1,-nMDates)])
z_hr_maint_a <- merge(z_hr,z_maint)
z_hr_maint <- na.approx(z_hr_maint_a)
z_repair <- zoo(c(0,3000,5000,8000),c(first_maint,first_fault,last_fault,last_maint))
z_repair_maint_a <- merge(z_repair,z_maint)
z_repair_maint <- na.approx(z_repair_maint_a)

df_maint <- data.frame(date=index(z_hr_maint),
                       hrs=coredata(z_hr_maint)/9.8,
                       repairs=coredata(z_repair_maint)/8000)

Plot the sample data, these examples work

rpr_title = "repairs/8000"
flt_title = "hrs/9.8"
(gp2 <- ggplot(data=df_fault,aes(x=date, y=ci, color=type)) +
   labs(x="Date (2014)", y="CI Amplitude",title="Sample, this plot is fine, df_fault") +
   geom_line(aes(group=type,shape=type))+ 
   geom_point(aes(group=type,shape=type),size=4)+ 
   theme(plot.title=element_text( size=12),
         axis.title=element_text( size=8)) ) 
(gp2a <- ggplot() + geom_line(data=df_maint,aes(x=date,y=repairs,color=rpr_title))+
  geom_line(data=df_maint,aes(x=date,y=hrs,color=flt_title))+
   labs(x="Date (2014)", y="CI Amplitude",title="Sample, this plot is fine, df_maint ") 
 )

This plot shows the fault data: [image: Plot of fault data]

This plot shows the maintenance and usage data: [image]

I would like to combine the above two plots into one plot with four legend entries. Here is my current attempt, but the legend isn't correct:

(gp2b <- gp2 + geom_line(data=df_maint,aes(x=date,y=repairs,color=rpr_title))+
   geom_line(data=df_maint,aes(x=date,y=hrs,color=flt_title))+
   labs(x="Date (2014)", y="CI Amplitude",title="Sample, this plot the legend is wrong") 
 )

In this plot there are two legends, and neither one is correct. The first "type" legend has the wrong symbols on the lines, showing a circle for every entry. The second "type" legend shows two black symbols, so its colors are incorrect. I would like the second legend removed and the first legend to show the correct lines, symbols, and colors. Also, it would be nice if the lines without symbols could be wider. The legend line/symbol for "Advisory" is correct. The legend entry for "Non-Op" should have a triangle instead of a circle. The legend entries for "hrs/9.8" and "repairs/8000" should have only a line, no symbol.

Brandon's suggestion to use melt helps, but the plot below still doesn't have the correct legend:

library(reshape2) # provides melt()
names(df_fault)[2:3] <- c("variable","value") # for rbind 
dat <- melt(df_maint, c("date")) # melted
dat <- rbind(dat, df_fault)

p1 <- ggplot(dat, aes(date,value, group = variable, color = variable)) + geom_line()
p1 + geom_point(data = 
                  dat[dat$variable %in% c("Advisory","Non-Op"),], 
                aes(date,value, group = variable, color = variable, shape=variable)) + 
  scale_colour_discrete(name  ="Fleet",
                        breaks=c("hrs", "repairs","Advisory","Non-Op"),
                        labels=c("usage hrs", "maint repairs","Advisory Faults","Non-Op Faults")) +
  scale_shape_discrete(name  ="Fleet",
                       breaks=c("hrs", "repairs","Advisory","Non-Op"),
                       labels=c("usage hrs", "maint repairs","Advisory Faults","Non-Op Faults"),
                       guide = "none")

Postscript: I want to mention that it took some effort to apply the above procedure to an actual data set. Here's a summary of the process.

1) Identify the x axis variables, and grouping variables.

2) In the two data frames, rename the x axis variable and group variables to the same names

3) Use melt twice (the example only used it once) to generate a melted data frame from each source. Use the x axis and group variables as id.vars. Specify the variables that you want to plot as measure.vars.

3b) Do head on the melted data frames. You need to see the X axis variable names and the grouping variable names, followed by the field variable and values. The field variable has text values corresponding to the different y axis names.

4) Use rbind to combine the two melted dataframes

5) Do head on both steps 3 and 4 so you understand the storage of the data

6) Plot the lines for all the data. Include the modification of the legend title in this step, using + guides(color=guide_legend(title="Fleet")). I don't see this command in the example.

7) Create a subset of the melted data frame containing the data that will have symbols. Add the symbols, but suppress the second legend that the symbols would create, as the example does with + scale_shape_discrete(name ="Fleet", guide = "none").

8) Adjust the legend line symbols using + guides(colour = guide_legend(override.aes = list(shape = c(32,32,16,17))))

9) Once you can see a nominal plot of lines with some symbols and the correct legend, you may need to repeat the above process after sorting the combined melted data frame in order to get the correct lines / symbols in front. You may want to sort on variable, and the x axis fields.
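The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: lines_df and points_df are made-up stand-ins for the two real data frames, both stored wide, and the column names are hypothetical. Only melt, rbind, and the ggplot2 calls already used in this thread are involved.

```r
library(reshape2)
library(ggplot2)

# Hypothetical stand-ins for the two source data frames (assumed wide format).
lines_df <- data.frame(date    = as.Date("2014-01-01") + c(0, 100, 200, 300),
                       hrs     = c(0, 0.2, 0.8, 1),
                       repairs = c(0, 0.4, 0.6, 1))
points_df <- data.frame(date     = as.Date("2014-05-01") + c(0, 20, 40, 60),
                        Advisory = c(0.9, NA, 1.4, NA),
                        `Non-Op` = c(NA, 1.1, NA, 1.7),
                        check.names = FALSE)

# Steps 3-4: melt each frame with the shared x variable as id.vars, then stack.
m_lines  <- melt(lines_df,  id.vars = "date", measure.vars = c("hrs", "repairs"))
m_points <- melt(points_df, id.vars = "date",
                 measure.vars = c("Advisory", "Non-Op"), na.rm = TRUE)
dat <- rbind(m_lines, m_points)

# Step 5: inspect how the melted data is stored.
head(dat)

# Step 6: lines for every series.
p <- ggplot(dat, aes(date, value, group = variable, colour = variable)) +
  geom_line(size = 1)

# Step 7: symbols only for the subset, without a second (shape) legend.
sym <- dat[dat$variable %in% c("Advisory", "Non-Op"), ]
p <- p +
  geom_point(data = sym, aes(shape = variable), size = 3) +
  scale_shape_discrete(guide = "none")

# Steps 6 and 8: legend title, plus legend symbols in legend order
# (32 draws nothing, 16 is a circle, 17 is a triangle).
p <- p + guides(colour = guide_legend(title = "Fleet",
                                      override.aes = list(shape = c(32, 32, 16, 17))))
p
```

Note that override.aes is positional, so the shape vector has to follow the order of the legend entries. Here that order comes from the factor levels of variable after rbind (hrs, repairs, Advisory, Non-Op).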

  • It's not good form to just dump code and ask people to "fix" it. What exactly is the problem you are having? What are the shortcoming of your own code? What can't you figure out how to do exactly? Include some data in a [reproducible](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) format. Add pictures with illustrations to show exactly what you want to accomplish. – MrFlick Oct 10 '14 at 20:43
  • That is waaaaayyyyy too much code. Do you expect people to go through that line by line? I sure wouldn't do it unless I was earning a paycheck. Furthermore, it's actually off-topic to dump code and ask people to fix it. – Rich Scriven Oct 10 '14 at 23:00
  • I have corrected the sample data / example. If you run the code, it generates three figures, the 3rd one is the one that is trying to be made correctly. At this point, the only problem is in the legend. –  Oct 10 '14 at 23:05
  • I agree that question should be well posed. I had to go to a meeting, and I "thought" the example would work, but the time interpolation was not correct. In the actual data, I don't have to interpolate the time, but for this example to be representative, there needed to be time interpolation since the two data frames are at two different samplings. Can someone explain how to upload figures to a question. I don't know how to do that. –  Oct 10 '14 at 23:12
  • On the toolbar above the edit window is an `image` icon. It's in the second set of icons. – Brandon Bertelsen Oct 11 '14 at 20:50
  • Thanks, I added figures, with description of what is correct and incorrect. –  Oct 11 '14 at 21:10

2 Answers

2

Adding guides() with override.aes makes the plot come out correct: set the legend shape to 32 (which draws no symbol) for the two line-only entries, and match the plotted symbols (16, 17) for the other two.

p1 <- ggplot(dat, aes(date,value, group = variable, color = variable)) + geom_line(size=1)
p1 + geom_point(data = 
                  dat[dat$variable %in% c("Advisory","Non-Op"),], 
                aes(date,value, group = variable, color = variable, shape=variable),size=3) + 
  scale_colour_discrete(name  ="Fleet",
                        breaks=c("hrs", "repairs","Advisory","Non-Op"),
                        labels=c("usage hrs", "maint repairs","Advisory Faults","Non-Op Faults")) +
  scale_shape_discrete(name  ="Fleet",
                       guide = "none") +
  guides(colour = guide_legend(override.aes = list(shape = c(32,32,16,17))))

  • +1 for an example of override.aes, I've never seen that before. – Brandon Bertelsen Oct 11 '14 at 22:55
  • At this point, after applying the procedure to actual data, I think there may be a simpler approach to plotting data from different frames. After combining the frames using melt/rbind, add columns for grouping, color, and symbol type, using symbol 32 for lines without symbols. Also, it would be a good idea to sort the melted data prior to plotting. –  Oct 14 '14 at 14:11
0

When in doubt, melt. See the example below:

library(reshape2)
library(ggplot2) 
names(df_fault)[2:3] <- c("variable","value") # for rbind 
dat <- melt(df_maint, c("date")) # melted
dat <- rbind(dat, df_fault)

p1 <- ggplot(dat, aes(date,value, group = variable, color = variable)) + geom_line()
p1 + geom_point(data = 
    dat[dat$variable %in% c("Advisory","Non-Op"),], 
    aes(date,value, group = variable, color = variable, shape=variable)) + 
scale_shape(guide = "none")

Notice that I specified "data" in my geom_point() call. Each scale_ has a method for removing the guide by setting it to "none".
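As a small illustration of both points, here is a sketch on made-up data (the toy data frame and its column values are hypothetical). The first positional argument of geom_point() is mapping, so a subset has to be passed by name as data =. The shape legend can be dropped either on the scale or through guides().

```r
library(ggplot2)

# Hypothetical toy data: one line-only series and one series that also gets symbols.
toy <- data.frame(x = rep(1:3, 2),
                  y = c(1, 2, 3, 2, 3, 5),
                  variable = rep(c("line_only", "with_points"), each = 3))
pts <- toy[toy$variable == "with_points", ]

base <- ggplot(toy, aes(x, y, group = variable, colour = variable)) + geom_line()

# Pass the subset by name; an unnamed first argument would be taken as `mapping`.
p_a <- base + geom_point(data = pts, aes(shape = variable)) +
  scale_shape(guide = "none")      # remove the shape legend on the scale

# Same idea, expressed through guides() instead of the scale.
p_b <- base + geom_point(data = pts, aes(shape = variable)) +
  guides(shape = "none")
```

Either way only the colour legend remains. Its keys may still show a point symbol on the line-only entries, which is what the override.aes approach in the other answer addresses.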

Brandon Bertelsen
  • Now, how to add those points? + geom_point(dat[dat$variable %in% c("Advisory","Non-Op"),], aes(date,value, group = variable, color = variable, shape=variable)) # doesn't work –  Oct 11 '14 at 21:20
  • I tried scale_shape, and scale_discrete_shape, with guide = "none", but the legend still has symbols on the added lines. –  Oct 11 '14 at 22:03