I have a dataset that looks like this (but with more years of data):
dat <- data.frame(
  date = as.Date(c("2000-01-01", "2000-03-31", "2000-07-01", "2000-09-30",
                   "2001-01-01", "2001-03-31", "2001-07-01", "2001-09-30")),
  value = c(0.8, 1, 0.2, 0, 0.7, 1, 0.2, 0))
I would like to select, for each year, the first row where "value" is >= 0.8.
So for the above dataset, I would expect the output to be a data frame with two rows and two columns:
new_dat <- data.frame(date = as.Date(c("2000-01-01", "2001-03-31")),
                      value = c(0.8, 1))
print(new_dat)
I have been trying to accomplish this using dplyr:
library(dplyr)
library(lubridate) # provides year()
dat_grouped <- dat %>%
  mutate(year = year(date)) %>%
  group_by(year) %>%
  distinct(value >= 0.8, date = date) # wanted to keep the date column
It gives me TRUE FALSE values for the "value" column, but I can't seem to find a good way to select the first TRUE value. I've tried wrapping distinct() with first() and I've tried piping to which.min(), but neither worked.
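For reference, one possible tidy approach is to filter on the condition first and then keep the first remaining row in each year, rather than going through distinct(). This is only a minimal sketch, assuming dplyr and lubridate are installed and that "first" means earliest by date; the name first_high is made up for illustration.

```r
library(dplyr)
library(lubridate)

# Sample data from the question
dat <- data.frame(
  date = as.Date(c("2000-01-01", "2000-03-31", "2000-07-01", "2000-09-30",
                   "2001-01-01", "2001-03-31", "2001-07-01", "2001-09-30")),
  value = c(0.8, 1, 0.2, 0, 0.7, 1, 0.2, 0)
)

first_high <- dat %>%
  mutate(year = year(date)) %>%          # derive the grouping key
  group_by(year) %>%
  arrange(date, .by_group = TRUE) %>%    # assumption: "first" = earliest date in the year
  filter(value >= 0.8) %>%               # keep only rows meeting the threshold
  slice(1) %>%                           # first qualifying row per year
  ungroup() %>%
  select(date, value)

print(first_high)
```

Years with no row meeting the threshold simply drop out of the result with this approach.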
I found this entry, but I was hoping for a tidy solution. I'm also having trouble adapting that code to my dataset; I get "Error in apply(x, 2, my.first) : dim(X) must have a positive length".
I would also like to do the same thing for the first occasion where value <= 0.2. I assume it would be the same process with a different logical condition. Or perhaps a logical comparison is not the way to go?
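To illustrate the "same process, different condition" idea, here is a rough sketch in which the condition is passed in as a function. The helper first_per_year and its cond argument are hypothetical names, not part of dplyr, and the sketch assumes dplyr and lubridate are available.

```r
library(dplyr)
library(lubridate)

# Sample data from the question
dat <- data.frame(
  date = as.Date(c("2000-01-01", "2000-03-31", "2000-07-01", "2000-09-30",
                   "2001-01-01", "2001-03-31", "2001-07-01", "2001-09-30")),
  value = c(0.8, 1, 0.2, 0, 0.7, 1, 0.2, 0)
)

# Hypothetical helper: first row per year whose value satisfies cond()
first_per_year <- function(data, cond) {
  data %>%
    mutate(year = year(date)) %>%
    group_by(year) %>%
    arrange(date, .by_group = TRUE) %>%  # assumption: "first" = earliest date
    filter(cond(value)) %>%              # cond is any function returning TRUE/FALSE per value
    slice(1) %>%
    ungroup()
}

first_low  <- first_per_year(dat, function(v) v <= 0.2)
first_high <- first_per_year(dat, function(v) v >= 0.8)

print(first_low)
print(first_high)
```

Only the function passed as cond changes between the two calls, so the logical comparison itself seems workable here.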
Any suggestions are greatly appreciated. Thank you.