I have 30 years of data that I want to use to plot daily means.
This is basically the graph that I want, but I don't think I'm working with the data in the most efficient way. As I just learned from a previous question, I end up with some date format issues, and the 29 February data is difficult to handle.
I am currently working with three columns of dates (date, ddmm, daymo) that have slightly different formatting.
> str(SFdischg)
'data.frame': 11932 obs. of 5 variables:
$ date : Factor w/ 11932 levels "1/01/1985","1/01/1986",..: 97 4409 8697 9677 10069 10461 10853 11245 11637 489 ...
$ ddmm : Factor w/ 366 levels "01-Apr","01-Aug",..: 1 13 25 37 49 61 73 85 97 109 ...
$ year : int 1984 1984 1984 1984 1984 1984 1984 1984 1984 1984 ...
$ cfs : int 1500 1430 1500 1850 1810 1830 1850 1880 1970 1980 ...
$ daymo: Date, format: "2018-04-01" "2018-04-02" "2018-04-03" ...
You can see that daymo is all the same year. I created it using SFdischg$ddmm[is.na(as.Date(SFdischg$ddmm, format="%d-%b"))], but because 2018 is not a leap year, there are NAs wherever previous years have data for 29 Feb. I can omit this data, but the whole step seems a bit silly and cumbersome.
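To illustrate the 29 Feb problem in isolation, here is a minimal base R sketch. It uses numeric months and an explicit year so that it does not depend on the locale or on the year in which it is run; the vector name day_month is made up for this example.

```r
# Day/month strings around the leap day (illustrative values only).
day_month <- c("28/02", "29/02", "01/03")

# Attach a non-leap year: 29 Feb does not exist, so as.Date() returns NA for it.
non_leap <- as.Date(paste0(day_month, "/2018"), format = "%d/%m/%Y")

# Attach a leap year: all three strings parse to valid dates.
leap <- as.Date(paste0(day_month, "/2016"), format = "%d/%m/%Y")

print(is.na(non_leap))
print(is.na(leap))
```

So any approach that stamps every observation with a non-leap year will lose the leap-day records.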
Instead of jumping through all of these hoops to get the plot I want, how can I use date instead of daymo to create the long-term mean and "current" year plot? I really only want to use two columns of data: date and cfs.
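For reference, this is roughly the kind of workflow I am imagining. It is only a sketch, not something I know to be the best approach. The data frame toy and its values are made up, plot_date is a hypothetical helper column, and I am assuming that re-stamping every date onto a leap year (2016 here) is an acceptable way to keep 29 Feb.

```r
# Toy stand-in for SFdischg with only the two columns of interest (values are made up).
toy <- data.frame(
  date = c("28/02/2012", "29/02/2012", "28/02/2015", "28/02/2016", "29/02/2016"),
  cfs  = c(1000, 1200, 1400, 1500, 1600),
  stringsAsFactors = TRUE  # mimic the factor column shown by str(SFdischg)
)

# Factor -> character -> Date; the format matches strings such as "1/04/1984".
toy$date <- as.Date(as.character(toy$date), format = "%d/%m/%Y")
toy$year <- as.integer(format(toy$date, "%Y"))

# Re-stamp every observation onto one leap year so 29 Feb stays a valid date.
toy$plot_date <- as.Date(format(toy$date, "2016-%m-%d"))

# Long-term daily mean over the earlier years, and the "current" year on its own.
pre <- subset(toy, year < 2016)
cur <- subset(toy, year == 2016)
daily_mean <- aggregate(cfs ~ plot_date, data = pre, FUN = mean)

print(daily_mean)
```

With something like this, plot_date would take the place of daymo on the x axis, and ddmm would not be needed at all.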
Here is how I created the plot:
library(ggplot2)
library(lubridate)  # dmy(), which already returns a Date

SF_pre <- subset(SFdischg, year < 2016)
SF_year <- subset(SFdischg, year == 2016)

ggplot(SF_pre, aes(x = daymo, y = cfs)) +
  # long-term daily mean with bootstrap interval (mean_cl_boot needs the Hmisc package)
  stat_summary(fun.data = "mean_cl_boot", geom = "smooth", colour = "blue", se = TRUE) +
  geom_line(data = SF_year, aes(x = daymo, y = cfs), size = 1) +
  # legend
  annotate("text", x = dmy("12/2/2018"), y = 3850, label = "30 year mean") +
  annotate("segment",
           x = dmy("1/1/2018"), xend = dmy("15/1/2018"),
           y = 3850, yend = 3850,
           colour = "blue", size = 1) +
  annotate("text", x = dmy("26/1/2018"), y = 3650, label = "2016") +
  annotate("segment",
           x = dmy("1/1/2018"), xend = dmy("15/1/2018"),
           y = 3650, yend = 3650,
           colour = "black", size = 1) +
  labs(x = "date", y = "discharge (cfs)") +
  scale_x_date(date_breaks = "1 month",
               date_labels = "%b") +
  # theme_bw() goes first: a complete theme added last would reset the tweaks below
  theme_bw() +
  theme(plot.title = element_text(color = "#666666", face = "bold", size = 24, hjust = 0),
        axis.title = element_text(color = "#666666", face = "bold", size = 16))
Edited to add data:
  year  cfs      daymo      date   ddmm
1 1984 1500 2018-04-01 1/04/1984 01-Apr
2 1984 1430 2018-04-02 2/04/1984 02-Apr
3 1984 1500 2018-04-03 3/04/1984 03-Apr
4 1984 1850 2018-04-04 4/04/1984 04-Apr
5 1984 1810 2018-04-05 5/04/1984 05-Apr
6 1984 1830 2018-04-06 6/04/1984 06-Apr