A few months back I made a figure using this data set. I'm now trying to manipulate it further but have two issues:
ORIGINAL FIGURE: (image not included)
1) How do I define point shapes differently than line colour? (so I have two symbol shapes on one line)
My collaborator wants me to change the symbols as the lines cross the vertical hatched line, so that circles become triangles and vice versa. Each line represents the same cohort of animals (with a different 'Order' of exposure), so I want the lines to stay the same. However, their RmTemp flips when they cross the hatched line, so I want the symbol to represent RmTempF (F is for factor; RmTempC is just the integer version). I tried setting shape = RmTempF. When I add RmTempF (to summarySE groupvars= and to ggplot shape=) and run the figure code, it tells me "Error: Aesthetics must be either length 1 or the same as the data (28): linetype, x, y, group, shape, colour"
EDIT: I was able to change the symbols by using SamplingTemp (which has 3 levels) instead of RmTemp (which has 2). I'm not certain why this works (there should only be two groups to match), but it makes the Aesthetics issue go away.
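One possible reading of that error, offered as an assumption rather than a diagnosis: a hard-coded linetype vector of length 26 (13 + 13) no longer matches a summary table that has 28 rows once an extra grouping variable is added. A minimal sketch of mapping linetype and shape to columns instead is below. The `toy` data frame, its values, and the cohort labels "A"/"B" and temperatures "20"/"28" are all hypothetical stand-ins, not the real Heatwave data.

```r
library(ggplot2)

# Hypothetical stand-in for the summarised data: 2 cohorts x 4 time points,
# with the temperature factor flipping halfway through each line.
toy <- data.frame(
  Order      = rep(c("A", "B"), each = 4),
  TimeCode   = rep(c("T01", "T02", "T03", "T04"), times = 2),
  RmTempF    = c("20", "20", "28", "28", "28", "28", "20", "20"),
  PropNormal = c(0.80, 0.78, 0.70, 0.72, 0.75, 0.74, 0.82, 0.81)
)

p <- ggplot(toy, aes(x = TimeCode, y = PropNormal, group = Order)) +
  geom_line(aes(linetype = Order)) +            # linetype follows the cohort
  geom_point(aes(shape = RmTempF), size = 2) +  # shape follows the temperature
  scale_linetype_manual(values = c(A = "dashed", B = "solid")) +
  scale_shape_manual(values = c("20" = 16, "28" = 17)) +  # circle, triangle
  theme_bw()
```

Because both aesthetics are read from columns of the data frame, their lengths always match the number of rows, however many groups the summary step produces.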
2) I need a better way to connect lines across missing data for publication. How?
I dropped a set of data from one time point (note the missing x-label), but I still want the space there. I created the gap by keeping the time point assignments (T06, at the very bottom of the dataset) for each 'Order' grouping, but with NA for the response variable. You can see from the code I just drew the lines in, but the hatched lines don't match, and in some views they look more pixelated. I've tried the interp1 suggested here, but things go haywire.
EDIT: aosmith suggested removing the NAs from the RV of interest (which I couldn't do because that was how I was making my gap for the missing data). Instead, I replaced the NAs with the average of the two points I was trying to connect. I also had to add two rows of data for each point I wanted to replace, because summarySE otherwise yielded NA. THEN for geom_errorbar and geom_point I used data=HNormal[-c(6,19),] to remove the rows where I didn't want a data point or error bar. Yay, a prettier graph!
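For comparison, here is a hedged sketch of a different way to get the same look without inventing averaged values: drop the NA rows before plotting and reserve the empty slot on the axis with explicit `limits` in `scale_x_discrete()`. The `toy` data and its values are hypothetical, and this is an alternative to the averaging workaround described above, not the approach I actually used.

```r
library(ggplot2)

# Hypothetical data: 2 cohorts x 5 time points, with T03 missing for both.
time_levels <- sprintf("T%02d", 1:5)
toy <- data.frame(
  Order      = rep(c("A", "B"), each = 5),
  TimeCode   = rep(time_levels, times = 2),
  PropNormal = c(0.80, 0.78, NA, 0.72, 0.70, 0.75, 0.74, NA, 0.82, 0.81)
)

# Keep only the observed rows, so geom_line joins T02 directly to T04.
observed <- toy[!is.na(toy$PropNormal), ]

p_gap <- ggplot(observed, aes(x = TimeCode, y = PropNormal, group = Order)) +
  geom_line(aes(linetype = Order)) +
  geom_point() +
  # Explicit limits keep a position for T03 even though no row uses it.
  scale_x_discrete(limits = time_levels) +
  theme_bw()
```

The line then spans the gap with the same linetype as the rest of the series, and no point or error bar is drawn at the empty time point.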
UPDATED FIGURE: (image not included)
OK, so here is my UPDATED code for the figure, with notes. Below it, for full disclosure and so things work the same as they do for me, I'm attaching the code for calculating SE:
HNormal <- summarySE(Heatwave, measurevar="PropNormal", groupvars=c("Order","TimeCode","SamplingTempF"),na.rm=TRUE) #Need to run code for summarySE for this to work!!
HNormalgg <- ggplot(HNormal, aes(x=TimeCode, y=PropNormal, group=Order))
pd <- position_dodge(0.2) # dodge them 0.2 to the left and right
HNormalgg +
  theme_bw() + # gets rid of grey background / inverts grey and white
  annotate("rect", xmin=0, xmax=1, ymin=0.5, ymax=0.92, fill="grey", alpha=0.2) + # add shaded bar 1
  annotate("rect", xmin=5.2, xmax=7, ymin=0.5, ymax=0.92, fill="grey", alpha=0.2) + # add shaded bar 2
  annotate("rect", xmin=11.2, xmax=13.5, ymin=0.5, ymax=0.92, fill="grey", alpha=0.2) + # add shaded bar 3
  geom_errorbar(data=HNormal[-c(6,19),], aes(ymin=PropNormal-se, ymax=PropNormal+se), width=.25, position=pd) + # se bars, skipping the filled-in rows
  geom_line(aes(linetype=c(rep("dashed",13),rep("solid",13))), position=pd) + # lines (group=Order), dodged; vector assumes 13 rows per Order
  geom_point(data=HNormal[-c(6,19),], aes(shape=SamplingTempF), size=1.75, position=pd) + # point size and dodge, skipping the filled-in rows
  geom_vline(xintercept=7, colour="grey45", linetype="longdash") + # vertical hatched line separating parts
  scale_y_continuous("Proportion normal", limits=c(0.5,0.92), expand=c(0,0)) + # label
  scale_x_discrete("Sampling time point ", labels = c("T01"="Day1","T02"="Day3","T03"="Day7","T04"="Day11","T05"="Day14", "T06"="","T07"="Day26/Day1", "T08"="Day3","T09"="Day7","T10"="Day11","T11"="Day14","T12"="Day21","T13"="Day26")) + # relabel x-axis
  theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust=1)) + # tilt the buggers
  scale_colour_manual(values=c("black","#666666")) +
  theme(legend.position = "none")
Code for summarySE, which I use for figures, just in case that is part of the problem:
summarySE <- function(data=NULL, measurevar, groupvars=NULL, na.rm=FALSE,
                      conf.interval=.95, .drop=TRUE) {
    library(plyr)
    # New version of length which can handle NA's: if na.rm==T, don't count them
    length2 <- function (x, na.rm=FALSE) {
        if (na.rm) sum(!is.na(x))
        else length(x)
    }
    # This does the summary. For each group's data frame, return a vector with
    # N, mean, and sd
    datac <- ddply(data, groupvars, .drop=.drop,
        .fun = function(xx, col) {
            c(N    = length2(xx[[col]], na.rm=na.rm),
              mean = mean   (xx[[col]], na.rm=na.rm),
              sd   = sd     (xx[[col]], na.rm=na.rm)
            )
        },
        measurevar
    )
    # Rename the "mean" column
    datac <- rename(datac, c("mean" = measurevar))
    datac$se <- datac$sd / sqrt(datac$N) # Calculate standard error of the mean
    # Confidence interval multiplier for standard error
    # Calculate t-statistic for confidence interval:
    # e.g., if conf.interval is .95, use .975 (above/below), and use df=N-1
    ciMult <- qt(conf.interval/2 + .5, datac$N-1)
    datac$ci <- datac$se * ciMult
    return(datac)
}
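As a sanity check on what summarySE computes for a single group, the same arithmetic can be sketched in base R. The vector `x` below is made-up data. The last line illustrates why a group with only one non-missing value ends up with NA for sd (and therefore for se and ci).

```r
# Hypothetical measurements for one group, including one missing value.
x <- c(0.80, 0.70, NA, 0.75)

n  <- sum(!is.na(x))                   # N, counting only non-missing values
m  <- mean(x, na.rm = TRUE)            # group mean
s  <- sd(x, na.rm = TRUE)              # group standard deviation
se <- s / sqrt(n)                      # standard error of the mean
ci <- se * qt(0.95 / 2 + 0.5, n - 1)   # half-width of the 95% interval

# A group with a single observation has no defined standard deviation.
single_sd <- sd(0.8)                   # NA
```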