I have a dataframe with dates/times (time series), site (grouping var) and value. I have identified the start times of different 'surges' - defined as changes in values of >=2 in 15 mins. For each surge time, I am trying for the date/time where the value falls back down to (or below) the start of the surge (i.e., the end of the surge).
I can achieve this by using a recursive loop function ('find.next.smaller' from this question - In a dataframe, find the index of the next smaller value for each element of a column). This works perfectly on a smaller dataframe, but not a large one. I get the error message "Error: C stack usage 15925584 is too close to the limit". Having seen other similar questions (e.g., Error: C stack usage is too close to the limit), I do not think its a problem of an infinite recursive function, but a memory issue. But I do not know how to use shell (or powershell) to do this. I wondered whether there was any other way? Either through adapting my memory or the function below?
Some example code:
###df formatting
library(dplyr)
df <- data.frame("Date_time" =seq(from=as.POSIXct("2022-01-01 00:00") , by= 15*60, to=as.POSIXct("2022-01-01 07:00")),
"Site" = rep(c("Site A", "Site B"), each = 29),
"Value" = c(10,10.1,10.2,10.3,12.5,14.8,12.4,11.3,10.3,10.1,10.2,10.5,10.4,10.3,14.7,10.1,
16.7,16.3,16.4,14.2,10.2,10.1,10.3,10.2,11.7,13.2,13.2,11.1,11.4,
rep(10.3,times=29)))
df <- df %>% group_by(Site) %>% mutate(Lead_Value = lead(Value))
df$Surge_start <- NA
df[which(df$Lead_Value - df$Value >=2),"Surge_start"] <-
paste("Surge",seq(1,length(which(df$Lead_Value - df$Value >=2)),1),sep="")
###Applying the 'find.next.smaller' function
find.next.smaller <- function(ini = 1, vec) {
if(length(vec) == 1) NA
else c(ini + min(which(vec[1] >= vec[-1])),
find.next.smaller(ini + 1, vec[-1]))
} # the recursive function will go element by element through the vector and find out
# the index of the next smaller value.
df$Date_time <- as.character(df$Date_time)
Output <- df %>% group_by(Site) %>% mutate(Surge_end = ifelse(grepl("Surge",Surge_start),Date_time[find.next.smaller(1, Value)],NA))
###This works fine
df2 <- do.call("rbind", replicate(1000, df, simplify = FALSE))
Output2 <- df2 %>% group_by(Site) %>% mutate(Surge_end = ifelse(grepl("Surge",Surge_start),Date_time[find.next.smaller(1, Value)],NA))
####This does not work