I have 30 years of data that I want to use to plot daily means.

[Figure: 30 year hydrograph]

This is basically the graph that I want, but I don't think I'm working with the data in the most efficient way. As I just learned from a previous question, I end up with some date format issues, and the 29 February data is difficult to handle.

I am currently working with three columns of dates (date, ddmm, daymo) that have slightly different formatting.

> str(SFdischg)
'data.frame':   11932 obs. of  5 variables:
 $ date : Factor w/ 11932 levels "1/01/1985","1/01/1986",..: 97 4409 8697 9677 10069 10461 10853 11245 11637 489 ...
 $ ddmm : Factor w/ 366 levels "01-Apr","01-Aug",..: 1 13 25 37 49 61 73 85 97 109 ...
 $ year : int  1984 1984 1984 1984 1984 1984 1984 1984 1984 1984 ...
 $ cfs  : int  1500 1430 1500 1850 1810 1830 1850 1880 1970 1980 ...
 $ daymo: Date, format: "2018-04-01" "2018-04-02" "2018-04-03" ...

You can see that daymo is all in the same year (2018). I created it using SFdischg$ddmm[is.na(as.Date(SFdischg$ddmm, format="%d-%b"))], but because 2018 is not a leap year, there are NAs wherever earlier years have data for 29 Feb. I can omit this data, but this whole step seems a bit silly and cumbersome.
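To illustrate the 29 Feb problem described above, here is a minimal sketch. The short ddmm vector is made up for the example, and pasting on the leap year 2016 is only an assumption about one possible workaround, not what was done to build daymo. Note that "%b" depends on the locale (English month abbreviations are assumed).

```r
# Made-up sample of day-month strings in the same layout as the ddmm column
ddmm <- c("28-Feb", "29-Feb", "01-Mar")

# With a non-leap year such as 2018, 29 Feb is not a valid date and becomes NA
in_2018 <- as.Date(paste0(ddmm, "-2018"), format = "%d-%b-%Y")
is.na(in_2018)

# Assumption: using a fixed leap year (2016 here) keeps 29 Feb valid
in_2016 <- as.Date(paste0(ddmm, "-2016"), format = "%d-%b-%Y")
is.na(in_2016)
```

When the format string has no year at all, as.Date() fills in the current year, so the result changes depending on when the code is run.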

Instead of jumping through all of these hoops to get the plot I want, how can I use date instead of daymo to create the long-term mean and "current" year plot? I really only want to use two columns of data: date and cfs.
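A rough sketch of the kind of thing being asked for, using only date and cfs. The sample rows are made up, plot_date and daily_mean are hypothetical names, and mapping everything onto the leap year 2016 is an assumption rather than a tested solution for the full data set.

```r
# Made-up sample in the same layout as the date and cfs columns
SF <- data.frame(
  date = c("28/02/1984", "29/02/1984", "1/03/1984", "28/02/1985", "1/03/1985"),
  cfs  = c(1500, 1430, 1500, 1850, 1810),
  stringsAsFactors = FALSE
)

# Parse the d/m/Y strings (wrap in as.character() first if date is a factor)
SF$date <- as.Date(SF$date, format = "%d/%m/%Y")
SF$year <- as.integer(format(SF$date, "%Y"))

# Map every observation onto one leap year so that 29 Feb stays a valid date
SF$plot_date <- as.Date(paste0("2016-", format(SF$date, "%m-%d")))

# Long-term mean for each calendar day across all years
daily_mean <- aggregate(cfs ~ plot_date, data = SF, FUN = mean)
daily_mean
```

The plot_date column could then stand in for daymo on the x axis, with year used to split off the "current" year.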

Here is how I created the plot:

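library(ggplot2)
library(lubridate)  # for dmy()

# Split into the long-term record and the "current" year
SF_pre <- subset(SFdischg, year < 2016)
SF_year <- subset(SFdischg, year == 2016)

ggplot(SF_pre, aes(x=daymo, y=cfs)) +
  # mean_cl_boot needs the Hmisc package to be installed
  stat_summary(fun.data = "mean_cl_boot", geom = "smooth", colour = "blue", se = TRUE) +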
  geom_line(data=SF_year, aes(x=daymo, y=cfs), size=1) +
  # legend
  annotate("text", x=as.Date(dmy("12/2/2018")), y=3850, label = "30 year mean") +
  annotate("segment", 
           x=as.Date(dmy("1/1/2018")), xend = as.Date(dmy("15/1/2018")), 
           y=3850, yend = 3850, 
           colour = "blue", size = 1) +
  annotate("text", x=as.Date(dmy("26/1/2018")), y=3650, label = "2016") +
  annotate("segment", 
           x=as.Date(dmy("1/1/2018")), xend = as.Date(dmy("15/1/2018")), 
           y=3650, yend = 3650, 
           colour = "black", size = 1) +
  labs(x="date",y="discharge (cfs)") + 
  scale_x_date(date_breaks = "1 month", 
               date_labels = "%b") +
  # theme_bw() goes first; placed last it would reset the theme() settings below
  theme_bw() +
  theme(plot.title = element_text(color="#666666", face="bold", size=24, hjust=0)) +
  theme(axis.title = element_text(color="#666666", face="bold", size=16))

Edited to add data:

  year  cfs      daymo      date   ddmm
1 1984 1500 2018-04-01 1/04/1984 01-Apr
2 1984 1430 2018-04-02 2/04/1984 02-Apr
3 1984 1500 2018-04-03 3/04/1984 03-Apr
4 1984 1850 2018-04-04 4/04/1984 04-Apr
5 1984 1810 2018-04-05 5/04/1984 05-Apr
6 1984 1830 2018-04-06 6/04/1984 06-Apr
ayesha
  • Could you make your problem reproducible by sharing a sample of your data so others can help (please do not use `str()`, `head()` or screenshot)? You can use the [`reprex`](https://reprex.tidyverse.org/articles/articles/magic-reprex.html) and [`datapasta`](https://cran.r-project.org/web/packages/datapasta/vignettes/how-to-datapasta.html) packages to assist you with that. See also [Help me Help you](https://speakerdeck.com/jennybc/reprex-help-me-help-you?slide=5) & [How to make a great R reproducible example?](https://stackoverflow.com/q/5963269) – Tung Oct 29 '18 at 08:15
  • I thought discharge data should be plotted based on water year? – Tung Oct 29 '18 at 08:18
  • I added a chunk of data at the end of the question - I hope this helps! – ayesha Oct 29 '18 at 10:11
  • And, yes, we could consider plotting based on water year... but this works for the project I'm working on. Thanks! – ayesha Oct 29 '18 at 10:11
  • The data you provided are not sufficient for the code you posted. Please read the links in my 1st comment – Tung Oct 29 '18 at 15:56