I have data set in following format:
STATION CODE DATE HOUR hr_rain
SHIVAMOGGA 163 06/09/18 00 1.0
SHIVAMOGGA 163 06/09/18 04 1.0
SHIVAMOGGA 163 06/09/18 05 NA
SHIVAMOGGA 163 06/09/18 06 1.5
SHIVAMOGGA 163 06/09/18 07 2.5
SHIVAMOGGA 163 06/09/18 08 NA
SHIVAMOGGA 163 06/09/18 09 0.0
SHIVAMOGGA 163 06/09/18 10 0.5
SHIVAMOGGA 163 06/09/18 11 0.5
SHIVAMOGGA 163 06/09/18 12 NA
SHIVAMOGGA 163 06/09/18 13 NA
SHIVAMOGGA 163 06/09/18 14 0.5
SHIVAMOGGA 163 06/09/18 15 0.5
SHIVAMOGGA 163 06/09/18 16 0.5
SHIVAMOGGA 163 06/09/18 17 0.5
SHIVAMOGGA 163 06/09/18 18 0.5
SHIVAMOGGA 163 06/09/18 19 0.5
SHIVAMOGGA 163 06/10/19 03 0.5
SHIVAMOGGA 163 06/10/19 05 NA
SHIVAMOGGA 163 06/10/19 06 NA
SHIVAMOGGA 163 06/10/19 07 NA
SHIVAMOGGA 163 06/10/19 08 0.5
SHIVAMOGGA 163 06/10/19 09 0.0
SHIVAMOGGA 163 06/10/19 10 0.0
Here the parameter rainfall is in hourly accumulated format. I'm interested into hourly rainfall rate. The measurement starts everyday at 09 HOUR & sometimes observations are missing, so I tried to fill NA values (3 or more consecutive NA's will be left unchanged and consecutive NA's less than 2 in number are replaced & NA's at 8 HOUR has given the previous value) by grouping for 09 HOUR.
df1 <- df %>%
group_by(STATION, CODE, gr = cumsum(HOUR == '09')) %>%
mutate(hr_rain = na.approx(hr_Rain, rule = 2, maxgap = 2, na.rm = FALSE))
Again to calculate hourly rainfall rate I tried to group df1
with:
hourly_df <- df1 %>%
group_by(STATION, CODE , grp = cumsum(HOUR == '09')) %>%
mutate(RAINFALL = hr_rain - lag(hr_rain, default = 0))
But it's not working. It creates first group and then second group continues till end of dataframe. The result is coming like:
STATION CODE DATE HOUR hr_rain NUM_NA gp grp RAINFALL
SHIVAMOGGA 163 06/09/18 00 1.0 2 0 0 1
SHIVAMOGGA 163 06/09/18 04 1.0 2 0 0 0
SHIVAMOGGA 163 06/09/18 05 1.25 1 0 0 0.25
SHIVAMOGGA 163 06/09/18 06 1.5 1 0 0 0.25
SHIVAMOGGA 163 06/09/18 07 2.5 1 0 0 1
SHIVAMOGGA 163 06/09/18 08 2.5 1 0 0 0
SHIVAMOGGA 163 06/09/18 09 0.0 1 1 1 -2.5
SHIVAMOGGA 163 06/09/18 10 0.5 2 1 1 0.5
SHIVAMOGGA 163 06/09/18 11 0.5 2 1 1 0
SHIVAMOGGA 163 06/09/18 12 0.5 2 1 1 0
SHIVAMOGGA 163 06/09/18 13 0.5 2 1 1 0
SHIVAMOGGA 163 06/09/18 14 0.5 7 1 1 0
SHIVAMOGGA 163 06/09/18 15 0.5 7 1 1 0
SHIVAMOGGA 163 06/09/18 16 0.5 7 1 1 0
SHIVAMOGGA 163 06/09/18 17 0.5 7 1 1 0
SHIVAMOGGA 163 06/09/18 18 0.5 7 1 1 0
SHIVAMOGGA 163 06/09/18 19 0.5 7 1 1 0
SHIVAMOGGA 163 06/10/19 03 0.5 7 1 1 0
SHIVAMOGGA 163 06/10/19 05 NA 3 1 1 NA
SHIVAMOGGA 163 06/10/19 06 NA 3 1 1 NA
SHIVAMOGGA 163 06/10/19 07 NA 3 1 1 NA
SHIVAMOGGA 163 06/10/19 08 0.5 1 1 1 0.5
SHIVAMOGGA 163 06/10/19 09 0.0 2 2 1 -0.5
SHIVAMOGGA 163 06/10/19 10 0.0 2 2 1 0
With this for 9 HOUR I'm getting negative values, which I want to start with value of hr_rain (that's why I was trying to create another grouping by 09 HOUR) for that field. Thanks in advance for any help!