I have a data frame (data) that contains a date-time variable and some other variables. What I want is a new data frame in which each row corresponds to a row of the "old" df and holds a summary (e.g., the mean) of all observations that fall into the 15 minutes up to and including that row's timestamp.
I tackled this with the following code (I reduced it to one variable here; I actually have about 26):
#### SEE EDIT ! ###
library(lubridate)
# Make a reference df to start rbind later
chunks <- data.frame("unix_timestamp" = as.POSIXct("2018-12-01 08:47:00 CET"),
"Var1" = NA)
xmin <- 15 # Window length in minutes
# Start loop for each row in data
for (i in 1:nrow(data)) {
  help <- data[as.POSIXct(data[,1]) > (as.POSIXct(data[i,1]) - minutes(xmin)) &
               as.POSIXct(data[,1]) <= as.POSIXct(data[i,1]),] # Help data frame with time frame selection
  chunk <- data.frame("unix_timestamp" = as.POSIXct(data[i,1]),
                      "Var1" = mean(help$Var1))
  chunks <- rbind(chunks, chunk)
}
#Delete initial row
chunks <- chunks[-1,]
I'm satisfied with the output, and with a data frame of ~500 observations the speed is okay. However, I have some data sets with 60,000 rows, and there it runs almost forever.
I know others have had a similar problem, such as here, but unfortunately I was not able to adapt that solution to my case!
I appreciate any help!
Best!
EDIT:
library(lubridate)
data <- data.frame("unix_timestamp" = c("2015-05-01 14:12:57",
"2015-05-01 14:14:57",
"2015-05-01 14:15:57",
"2015-05-01 14:42:57",
"2015-05-01 14:52:57"),
"Var1" = c(2,3,4,2,1),
"Var2" = c(0.53,0.3,0.34,0.12,0.91),
"Var3" = c(1,1,1,1,1))
pre <- vector("list", nrow(data))
data
for (i in 1:length(pre)) {
  #to see progress
  print(paste(i, "of", nrow(data), sep = " "))
  help <- data[as.POSIXct(data[,1]) > (as.POSIXct(data[i,1]) - minutes(15)) &
               as.POSIXct(data[,1]) <= as.POSIXct(data[i,1]),] # Help data frame with time frame selection
  chunk <- data.frame("unix_timestamp" = as.POSIXct(data[i,1]),
                      "Var1" = mean(help$Var1),
                      "Var2" = mean(help$Var2),
                      "Var3" = sum(help$Var3))
  pre[[i]] <- chunk
}
output <- do.call(rbind, pre)
output
       unix_timestamp Var1  Var2 Var3
1 2015-05-01 14:12:57  2.0 0.530    1
2 2015-05-01 14:14:57  2.5 0.415    2
3 2015-05-01 14:15:57  3.0 0.390    3
4 2015-05-01 14:42:57  2.0 0.120    1
5 2015-05-01 14:52:57  1.5 0.515    2
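For comparison, here is a minimal sketch of one way the same trailing 15-minute summary might be computed without subsetting the data frame once per row. It is only an illustration, not a tested solution for the full data: it assumes the rows can be sorted by time, that the variables contain no NA values, and that parsing the timestamps as UTC is acceptable. The names window_secs, ts_num, lo, hi, n, window_sum and fast are made up for this example. The idea is to locate the first and last row of each window with findInterval() and to get the window sums from differences of cumulative sums.

```r
# Example data copied from the EDIT above
data <- data.frame(
  unix_timestamp = c("2015-05-01 14:12:57", "2015-05-01 14:14:57",
                     "2015-05-01 14:15:57", "2015-05-01 14:42:57",
                     "2015-05-01 14:52:57"),
  Var1 = c(2, 3, 4, 2, 1),
  Var2 = c(0.53, 0.3, 0.34, 0.12, 0.91),
  Var3 = c(1, 1, 1, 1, 1)
)

window_secs <- 15 * 60  # window length in seconds

# Convert once to numeric seconds (UTC is an assumption) and sort by time
ts_num <- as.numeric(as.POSIXct(data$unix_timestamp, tz = "UTC"))
ord    <- order(ts_num)
data   <- data[ord, ]
ts_num <- ts_num[ord]

# findInterval(x, v) returns, for each x, how many elements of the sorted
# vector v are <= x. Rows at or before (t_i - 15 min) are excluded, which
# matches the strict ">" in the loop version.
lo <- findInterval(ts_num - window_secs, ts_num) + 1L  # first row in window
hi <- findInterval(ts_num, ts_num)                     # last row in window
n  <- hi - lo + 1L                                     # rows per window

# Hypothetical helper: sum of x over rows lo[i]..hi[i] via cumulative sums
window_sum <- function(x) {
  cs <- c(0, cumsum(x))
  cs[hi + 1L] - cs[lo]
}

fast <- data.frame(
  unix_timestamp = as.POSIXct(data$unix_timestamp, tz = "UTC"),
  Var1 = window_sum(data$Var1) / n,  # mean over the window
  Var2 = window_sum(data$Var2) / n,  # mean over the window
  Var3 = window_sum(data$Var3)       # sum over the window
)
fast
```

On the five example rows this should agree with the loop output above, up to floating-point rounding. Because every step is vectorised, the work per variable is a single cumsum() rather than one subset per row, so it would presumably scale much better to 60,000 rows and 26 variables, but I have not benchmarked it.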