I have a data frame with several variables, and whose first columns look like this:
Place <- c(rep("PlaceA",14),rep("PlaceB",15))
Group_Id <- c(rep("A1",5),rep("A1",6),rep("A2",3),rep("B1",6),rep("B2",4),rep("B2",5))
Time <- as.Date(c("2018-01-15","2018-02-03","2018-02-27","2018-03-10","2018-03-18","2019-02-02","2019-03-01","2019-03-15","2019-03-28","2019-04-05","2019-04-12","2018-02-01",
"2018-03-01","2018-04-07","2018-01-17","2018-01-27","2018-02-17","2018-03-03","2018-04-02","2018-04-25","2018-03-03","2018-03-18","2018-04-08","2018-04-20",
"2019-01-23","2019-02-09","2019-02-27","2019-03-12","2019-03-30"))
FollowUp <- c("start",paste("week",week(ymd(Time[2:5]))),"start",paste("week",week(ymd(Time[7:11]))),"start",paste("week",week(ymd(Time[13:14]))),"start",paste("week",week(ymd(Time[16:20]))),"start",paste("week",week(ymd(Time[22:24]))),"start",paste("week",week(ymd(Time[26:29]))))
exprmt <- c(rep(1,5),rep(2,6),rep(3,3),rep(4,6),rep(5,4),rep(6,5))
> df1
Place Group_Id Time exprmt FollowUp
1 PlaceA A1 2018-01-15 1 start
2 PlaceA A1 2018-02-03 1 week 5
3 PlaceA A1 2018-02-27 1 week 9
4 PlaceA A1 2018-03-10 1 week 10
5 PlaceA A1 2018-03-18 1 week 11
6 PlaceA A1 2019-02-02 2 start
7 PlaceA A1 2019-03-01 2 week 9
8 PlaceA A1 2019-03-15 2 week 11
9 PlaceA A1 2019-03-28 2 week 13
10 PlaceA A1 2019-04-05 2 week 14
11 PlaceA A1 2019-04-12 2 week 15
12 PlaceA A2 2018-02-01 3 start
13 PlaceA A2 2018-03-01 3 week 9
14 PlaceA A2 2018-04-07 3 week 14
15 PlaceB B1 2018-01-17 4 start
16 PlaceB B1 2018-01-27 4 week 4
17 PlaceB B1 2018-02-17 4 week 7
18 PlaceB B1 2018-03-03 4 week 9
19 PlaceB B1 2018-04-02 4 week 14
20 PlaceB B1 2018-04-25 4 week 17
21 PlaceB B2 2018-03-03 5 start
22 PlaceB B2 2018-03-18 5 week 11
23 PlaceB B2 2018-04-08 5 week 14
24 PlaceB B2 2018-04-20 5 week 16
25 PlaceB B2 2019-01-23 6 start
26 PlaceB B2 2019-02-09 6 week 6
27 PlaceB B2 2019-02-27 6 week 9
28 PlaceB B2 2019-03-12 6 week 11
29 PlaceB B2 2019-03-30 6 week 13
For each Place (more than 2 in my actual data), I have a separate data frame with temperature records by hours. For example:
set.seed(1032)
t <- c(seq.POSIXt(from = ISOdate(2018,01,01),to = ISOdate(2018,06,01), by = "hour"),seq.POSIXt(from = ISOdate(2019,01,01),to = ISOdate(2019,06,01), by = "hour"))
temp_A <- runif(length(t),min = 5, max = 25)
temp_B <- runif(length(t),min = 3, max = 32)
data_A <- data.frame(t,temp_A)
data_B <- data.frame(t,temp_B)
> head(data_A)
t temp_A
1 2018-01-01 12:00:00 14.24961
2 2018-01-01 13:00:00 21.64925
3 2018-01-01 14:00:00 21.77058
4 2018-01-01 15:00:00 13.31673
5 2018-01-01 16:00:00 16.10350
6 2018-01-01 17:00:00 17.64567
I need to add a column in df1
with average temperature for the time interval by Place, group_Id and exprmt: the first of each group_by
should be a NaN, than I would need the average for each time interval. Knowing that for each Place, the data are also in a separate data frame.
I tried something like this, but it is not working:
df1 <- df1 %>% group_by(Place,Group_Id,exprmt) %>% mutate(
temp = case_when(FollowUp == "start" & Place == "PlaceA" ~ NA,
FollowUp == FollowUp[c(2:n())] & Place == "PlaceA" ~ mean(temp_A[c(which(date(temp_A$t))==lag(Time,1):which(date(temp_A$t))==Time),2]),
)
)
I found information on how calculate averages over multiple dataframes (e.g. this or this), but this is not what I am looking for. I would like to do it without a loop. My expected results is (etc stand for and so on..):
> df1
Place Group_Id Time exprmt FollowUp expected
1 PlaceA A1 2018-01-15 1 start NaN
2 PlaceA A1 2018-02-03 1 week 5 mean temp_A between 2018-01-15 and 2018-02-03
3 PlaceA A1 2018-02-27 1 week 9 mean temp_A between 2018-02-03 and 2018-02-27
4 PlaceA A1 2018-03-10 1 week 10 mean temp_A between 2018-02-27 and 2018-03-10
5 PlaceA A1 2018-03-18 1 week 11 mean temp_A between 2018-03-10 and 2018-03-18
6 PlaceA A1 2019-02-02 2 start NaN
7 PlaceA A1 2019-03-01 2 week 9 mean temp_A between 2019-02-02 and 2019-03-01
8 PlaceA A1 2019-03-15 2 week 11 etc
9 PlaceA A1 2019-03-28 2 week 13 etc
10 PlaceA A1 2019-04-05 2 week 14 etc
11 PlaceA A1 2019-04-12 2 week 15 etc
12 PlaceA A2 2018-02-01 3 start etc
13 PlaceA A2 2018-03-01 3 week 9 etc
14 PlaceA A2 2018-04-07 3 week 14 etc
15 PlaceB B1 2018-01-17 4 start NaN
16 PlaceB B1 2018-01-27 4 week 4 mean temp_B between 2018-01-17 and 2018-01-27
17 PlaceB B1 2018-02-17 4 week 7 etc
18 PlaceB B1 2018-03-03 4 week 9 etc
19 PlaceB B1 2018-04-02 4 week 14 etc
20 PlaceB B1 2018-04-25 4 week 17 etc
21 PlaceB B2 2018-03-03 5 start etc
22 PlaceB B2 2018-03-18 5 week 11 etc
23 PlaceB B2 2018-04-08 5 week 14 etc
24 PlaceB B2 2018-04-20 5 week 16 etc
25 PlaceB B2 2019-01-23 6 start etc
26 PlaceB B2 2019-02-09 6 week 6 etc
27 PlaceB B2 2019-02-27 6 week 9 etc
28 PlaceB B2 2019-03-12 6 week 11 etc
29 PlaceB B2 2019-03-30 6 week 13 etc
Any help will be appreciated!