
My two data frames are accessible here and here and I have been trying to follow this previous post.

I would like to populate wombat$rainfall_lag_2wk with the sum of the rainfall records for the previous two weeks (14 days); the rainfall data are available in rain. I tried several approaches before I found the above post. Most recently I have tried to follow that post, but I get the error below.

Any help would be greatly appreciated. I am happy with any solution, whether it follows the same structure as the above post or not.

Thanks in advance

# Load data
wombat <- read.csv("wombat.csv", header = TRUE)
rain <- read.csv("rain.csv", header = TRUE)

# Define dates
wombat$date <- as.Date(wombat$date, "%Y-%m-%d")
rain$Date <- as.Date(rain$Date, "%Y-%m-%d")

# Calculate rainfall for previous two weeks following above link
wombat$start_date <- rep_len("01/01/1970", nrow(wombat))
wombat$start_date <- as.Date(wombat$start_date, "%m/%d/%Y")
wombat$diff_days <- as.numeric(difftime(wombat$date, wombat$start_date, units = "days"))

rain$start_date <- rep_len("01/01/1970", nrow(rain))
rain$start_date <- as.Date(rain$start_date, "%m/%d/%Y")
rain$diff_days <- as.numeric(difftime(rain$Date, rain$start_date, units = "days"))

for (i in 1:length(wombat$diffdays)) { 
  day = wombat$diffdays[i]
  rainday = pmatch(day, rain$diffdays, dup = FALSE)
  wombat$rainfall_lag_2wk[i] = sum(rain$Rainfall.amount..millimetres.[(rainday-14):(rainday-1)])  # 14 days
}

Running the above produces this error:

Error in (rainday - 14):(rainday - 1) : argument of length 0

Pat Taggart
  • What would your ideal data look like? – Matt Jul 01 '20 at 00:58
  • I want the output data set to be the same as ```wombat``` but with the ```wombat$rainfall_lag_2wk``` column filled in, i.e. for each date in ```wombat``` I would have a corresponding value in ```wombat$rainfall_lag_2wk``` giving the total rainfall in the previous 14 days, with the rainfall data taken from ```rain$Rainfall.amount..millimetres.```. The final data set should be the same length as ```wombat``` and otherwise include the same data as ```wombat```, with the addition of the extra column ```rainfall_lag_2wk```. Thanks – Pat Taggart Jul 01 '20 at 02:40

2 Answers

1

I'm not sure what your final data should look like, so I'm assuming that you want to see the cumulative rainfall for the previous 14 days in the wombat data.

Here's a solution using the tidyverse and zoo packages.

library(tidyverse)
library(zoo)

rain <- read_csv("rain.csv") %>% 
  select(-X1)
wombat <- read_csv("wombat.csv") %>% 
  select(-X1) %>% 
  distinct()

rain_wombat <- left_join(rain, wombat, by = c("Date" = "date"))

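rain_wombat <- rain_wombat %>%
  mutate(rainfall_lag_2wk = as.numeric(rainfall_lag_2wk)) %>%
  rename(rainfall = `Rainfall.amount..millimetres.`) %>%
  # Treat missing values as zero rainfall
  replace(is.na(.), 0) %>%
  # Right-aligned 14-day rolling sum, then shift by one row so the current day is excluded
  mutate(rainfall_lag_2wk = round(rollsumr(rainfall, k = 14, fill = NA), 2),
         rainfall_lag_2wk = lag(rainfall_lag_2wk)) %>%
  # Keep only the date range covered by the wombat data
  filter(Date >= min(wombat$date) & Date <= max(wombat$date))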

This gives you data like:

    Date       rainfall rainfall_lag_2wk
   <date>        <dbl>            <dbl>
 1 2008-04-25      0                2.4
 2 2008-04-26      0                2.4
 3 2008-04-27      4.4              0  
 4 2008-04-28      0.4              4.4
 5 2008-04-29      0                4.8
 6 2008-04-30      0                4.8
 7 2008-05-01      3.4              4.8
 8 2008-05-02      0                8.2
 9 2008-05-03      0                8.2
10 2008-05-04      0                8.2
11 2008-05-05      0                8.2
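
To see what the rolling sum followed by the one-row lag computes, here is a minimal sketch on toy data. It swaps in base R's `stats::filter()` for `rollsumr()` so that it runs without extra packages; the rainfall values and the 3-day window are made up for illustration and are not taken from `rain.csv`.

```r
# Toy daily rainfall series (hypothetical values, not from rain.csv)
rainfall <- c(1, 0, 2, 0, 0, 3, 1)
k <- 3  # window length in days; the answer above uses 14

# Trailing sum over the current day and the k - 1 days before it.
# The first k - 1 entries are NA, comparable to rollsumr(..., fill = NA).
trailing <- as.numeric(stats::filter(rainfall, rep(1, k), sides = 1))

# Shift by one day so each value covers only the k days before that date
lagged <- c(NA, head(trailing, -1))

data.frame(rainfall, trailing, lagged)
```

Both steps assume one row per consecutive day; a missing date in the rain data would silently stretch the window over more than `k` calendar days.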
Matt
0

Thanks, Matt, for your answer, which helped me reach the solution below. It is partly adapted from here.

# Load libraries
library(tidyverse)  # attaches dplyr (mutate, inner_join, select) and purrr (map_dbl)
library(lubridate)  # days()

# Load data
wombat <- read.csv("wombat.csv", header = TRUE)
rain <- read.csv("rain.csv", header = TRUE)

# Define dates
wombat$date <- as.Date(wombat$date, "%Y-%m-%d")
rain$Date <- as.Date(rain$Date, "%Y-%m-%d")

# Calculate rainfall for previous two weeks 
rain$rainfall_lag_2wk <- rain$Rainfall.amount..millimetres.
rain <- rain %>%
  mutate(rainfall_lag_2wk = map_dbl(
    seq_len(n()),
    # For row .x, sum the rainfall on the 14 days strictly before its date
    ~ sum(Rainfall.amount..millimetres.[Date >= (Date[.x] - days(14)) & Date < Date[.x]],
          na.rm = TRUE)
  ))

wombat <- inner_join(wombat, rain, by = c("date" = "Date"))

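# wombat presumably already has an (empty) rainfall_lag_2wk column, so the join
# suffixes the newly computed column from rain with ".y"
wombat <- dplyr::select(wombat, date, rainfall_lag_2wk.y)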
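
For anyone hitting the same error with the loop in the question: it appears to come from a column-name mismatch rather than from the windowing logic. The columns are created as `diff_days`, but the loop reads `diffdays`. `$` returns `NULL` for a column that does not exist, so `pmatch()` gives a zero-length result and `:` fails with "argument of length 0". Below is a minimal sketch of a corrected loop on made-up data. The values, the `Rainfall` and `rainfall_lag` column names, and the 3-day window are assumptions for illustration, and it assumes `rain` has one row per consecutive day.

```r
# Toy stand-ins for the real data (hypothetical values)
rain <- data.frame(
  Date     = as.Date("2020-01-01") + 0:9,
  Rainfall = c(1, 0, 2, 0, 0, 3, 1, 0, 0, 4)
)
wombat <- data.frame(date = as.Date(c("2020-01-05", "2020-01-08")))

# `$` returns NULL for a column that does not exist, so its length is 0
is.null(wombat$diffdays)

# Corrected loop: match on the dates directly and index the window by position
window <- 3  # days; use 14 for a two-week lag
wombat$rainfall_lag <- NA_real_
for (i in seq_len(nrow(wombat))) {
  rainday <- match(wombat$date[i], rain$Date)
  # Only sum when the date was found and a full window precedes it
  if (!is.na(rainday) && rainday > window) {
    wombat$rainfall_lag[i] <- sum(rain$Rainfall[(rainday - window):(rainday - 1)])
  }
}
wombat
```

Using `seq_len(nrow(wombat))` instead of `1:length(...)` also avoids iterating over `1:0` when the input is empty, and `match()` on the dates removes the need for the intermediate day-count columns.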

Pat Taggart