I want to calculate the number of flooded days over a variable time interval. The end of each period is given in a vector or data.frame, and I want periods of different lengths, e.g. 4 days and 6 days.

How can I write flexible code that handles several values of date_end and a vector of different period lengths?

My complete data.frame covers about 2 years, with 12 end dates and 3 different period lengths.

df <- data.frame(date = c("2016-11-01", "2016-11-02", "2016-11-03", "2016-11-04", "2016-11-05", "2016-11-06", "2016-11-07", "2016-11-08", "2016-11-09", "2016-11-10"),
           flooded = c(0,0,0,1,1,1,0,0,1,1))
date_end <- as.Date(c("2016-11-04", "2016-11-10"), "%Y-%m-%d")

## length of the time period in days, e.g. 4 and 6 days
period <- c(4,6)

   date             flooded
1  2016-11-01       0
2  2016-11-02       0
3  2016-11-03       0
4  2016-11-04       1
5  2016-11-05       1
6  2016-11-06       1
7  2016-11-07       0
8  2016-11-08       0
9  2016-11-09       1
10 2016-11-10       1

All in all, I want to calculate the number of flooded days at my observation point. Thank you.

Nesch

1 Answer

You could write a function that sums the flooded column over the p days ending at a given end date e.

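floodCount <- function(datecol, floodcol, e, p) {
  # e: end date of the window; p: window length in days (end date included)
  e <- as.Date(e)
  datecol <- as.Date(datecol)
  stopifnot(!anyNA(c(e, p)))
  # both the first day (e - p + 1) and the last day (e) must be in the data
  stopifnot((e - p + 1) %in% datecol, e %in% datecol)
  sum(floodcol[which(datecol == e - p + 1):which(datecol == e)])
}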

Usage on your example data:

with(df, floodCount(date, flooded, date_end[2], period[2]))
# [1] 4
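
To make the window arithmetic inside floodCount explicit, here is a minimal sketch on the question's example data. It assumes the period length counts the end date itself, so a 6-day window ending on 2016-11-10 starts on 2016-11-05. The names `window` and `n_flooded` are only illustrative.

```r
# Example data from the question, with date stored as class Date
df <- data.frame(
  date = as.Date("2016-11-01") + 0:9,
  flooded = c(0, 0, 0, 1, 1, 1, 0, 0, 1, 1)
)

e <- as.Date("2016-11-10")  # end date of the window
p <- 6                      # window length in days, end date included

# Dates covered by the window: e - p + 1, ..., e
window <- seq(e - p + 1, e, by = "day")

# Flooded days inside the window
n_flooded <- sum(df$flooded[df$date %in% window])
n_flooded
# [1] 4
```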

On a larger scale (see data below):

with(df2, floodCount(date, flooded, date.end2[8], period2[3]))
# [1] 2

Or manually

with(df2, floodCount(date, flooded, "2015-11-06", 8))  # error: end date is not in the data
with(df2, floodCount(date, flooded, "2016-11-06", 8))  # error: window starts before the data
with(df2, floodCount(date, flooded, "2016-11-06", 4))  # ok!
# [1] 3

Update

To calculate all combinations of dates and periods, you can Vectorize a wrapper around floodCount and pass it to outer() over the indices of the two vectors. Wrapping the result in `dimnames<-`() labels the rows with the end dates and the columns with the periods.

floodCountv <- Vectorize(function(x, y) 
  with(df2, floodCount(date, flooded, date.end2[x], period2[y])))

`dimnames<-`(outer(seq_along(date.end2), seq(period2), floodCountv),
         list(as.character(date.end2), period2))
#            4 6 9
# 2017-02-11 2 4 6
# 2017-02-22 3 4 7
# 2017-03-13 4 5 7
# 2017-07-22 2 4 6
# 2017-07-24 2 3 6
# 2017-08-02 2 3 5
# 2017-09-08 1 1 3
# 2017-10-07 1 2 3
# 2018-04-16 1 2 4
# 2018-04-27 3 5 5
# 2018-10-08 3 4 6
# 2018-10-23 2 2 5
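
For the small example from the question, a 2x2 matrix can be sketched in a similar way. One of the four windows (6 days ending on 2016-11-04) starts before the data, so this sketch uses a hypothetical helper, `floodCountNA`, that returns NA instead of stopping. That behaviour is an assumption made for illustration and is not part of floodCount above.

```r
df <- data.frame(
  date = as.Date("2016-11-01") + 0:9,
  flooded = c(0, 0, 0, 1, 1, 1, 0, 0, 1, 1)
)
date_end <- as.Date(c("2016-11-04", "2016-11-10"))
period <- c(4, 6)

# Hypothetical helper: like floodCount, but returns NA instead of stopping
# when the window is not fully covered by the data
floodCountNA <- function(datecol, floodcol, e, p) {
  window <- seq(e - p + 1, e, by = "day")
  if (!all(window %in% datecol)) return(NA_real_)
  sum(floodcol[datecol %in% window])
}

# Rows: end dates, columns: period lengths
res <- sapply(period, function(p)
  sapply(seq_along(date_end), function(i)
    floodCountNA(df$date, df$flooded, date_end[i], p)))
dimnames(res) <- list(as.character(date_end), period)
res
#            4  6
# 2016-11-04 1 NA
# 2016-11-10 2  4
```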

Data

set.seed(42)
df2 <- data.frame(date=seq(as.Date("2016-11-01"), as.Date("2018-11-01"), "day"),
                  flooded=rbinom(731, 1, .5))
date.end2 <- sort(sample(df2$date, 12))
period2 <- c(4, 6, 9)
jay.sf
  • Thanks for your solution, it works pretty well. What would be the best way to modify it so that the output is a vector or data frame with more than one flood count? In the example with df, with 2 periods and 2 end dates, I should get a 2x2 matrix. When I tried it with your solution, I got the error `Error in floodCount(date, flooded, date_end[1:4], period[2]) : !anyNA(c(e, p)) ist nicht TRUE ` (German for "is not TRUE"). Thanks – Nesch Jul 09 '19 at 16:24
  • To use multiple dates you need e.g. `sapply`: `sapply(date.end2[1:4], function(x) with(df2, floodCount(date, flooded, x, period2[2])))`. – jay.sf Jul 09 '19 at 16:40
  • My solution to get a matrix is `F_2017 <- sapply(period2, function(p) with(df2, floodCount(date, flooded,date.end2[1], p))) FLOODED.T <- rbind(F_2017, ...) FLOODED.T2 <- as.data.frame(FLOODED.T) names(FLOODED.T2)[1:3] <- period2[1:3]` So I get a 4 x 3 data frame. But now I have the problem that I have not just one input series but over 10 floodplains, e.g. `data.frame(date = c(x,x,x), flooded1 = c(xx), flooded2 = c(yy), flooded3= c(zz))`. How can I create a loop so that I get a 4 x 3 matrix for each floodplain, or one big 4 x 30 matrix? Thanks – Nesch Jul 10 '19 at 16:11
  • @Nesch Probably vectorizing is a good way, see update. For multiple plains you probably need to [`reshape`](https://stackoverflow.com/a/2185525/6574038) or [`aggregate`](https://stackoverflow.com/a/32262439/6574038) your data. – jay.sf Jul 10 '19 at 18:11
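
For the multiple-floodplain case raised in the comments, one possible sketch that avoids reshaping is to loop over the flooded columns with lapply and build one end-date x period matrix per column. Everything here is assumed for illustration: the data frame `plains`, its columns `flooded1` to `flooded3`, the vectors `ends` and `periods`, and the helper `countWindow`. Unlike floodCount, this helper does not check that the window lies inside the data; it simply counts whatever rows fall into it.

```r
# Hypothetical data: one date column and one 0/1 column per floodplain
set.seed(1)
plains <- data.frame(
  date = seq(as.Date("2016-11-01"), by = "day", length.out = 30),
  flooded1 = rbinom(30, 1, 0.5),
  flooded2 = rbinom(30, 1, 0.5),
  flooded3 = rbinom(30, 1, 0.5)
)
ends <- as.Date(c("2016-11-10", "2016-11-20", "2016-11-30"))
periods <- c(4, 6, 9)

# Count flooded days in the p days ending at e (end date included);
# no range check: rows outside the data are simply not counted
countWindow <- function(datecol, floodcol, e, p) {
  sum(floodcol[datecol > e - p & datecol <= e])
}

# One end-date x period matrix per floodplain column
plain_cols <- setdiff(names(plains), "date")
result <- lapply(plains[plain_cols], function(fl) {
  m <- sapply(periods, function(p)
    sapply(seq_along(ends), function(i)
      countWindow(plains$date, fl, ends[i], p)))
  dimnames(m) <- list(as.character(ends), periods)
  m
})

# Optionally bind the matrices side by side into one wide matrix
wide <- do.call(cbind, result)
```

`result` is a named list with one matrix per floodplain, and `wide` has one block of period columns per floodplain, which corresponds to the "one big matrix" layout asked about.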