
I have the following vector of temperatures (in °C):

Temperature <- c(-3:3, 3:-3, rep(-3, 2), -2:-1, 1:3, 2:1, -1:-4)

I need to calculate the time (number of observations) elapsed since the last freeze event and I also need to calculate the number of observations elapsed since the last thaw event. Freeze events are marked by temperature transitions from positive to negative values, and thaw events are marked by temperature transitions from negative to positive values. The output should look like these vectors:

Time_Since_Last_Freeze <- c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 0, 1, 2, 3)
Time_Since_Last_Thaw <- c(NA, NA, NA, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 0, 1, 2, 3, 4, 5, 6, 7, 8)

I've seen a few similar questions on Stack Overflow but none of them are exactly what I need. What are some efficient ways to generate these two output vectors?

David Moore
  • What values exactly correspond to "freeze" and "thaw" events in your data? – MrFlick May 18 '22 at 14:26
  • I updated my question - freeze events are when temperatures transition from positive to negative values and thaw events are when temperatures transition from negative to positive values – David Moore May 18 '22 at 14:31
  • But 0 is still freezing? – Chris May 18 '22 at 14:33
  • That's a good question. Fortunately, in my actual data, my temperatures are measured pretty accurately and we don't have any that are exactly `0`, so that's not a huge concern for me. – David Moore May 18 '22 at 14:39

3 Answers


You can use this function, which finds the indices of the freeze (or thaw) events in the original vector, then fills the stretch from each event up to the next one with a counter that starts at 0. Only `dplyr::lag` is not base R:

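f <- function(temp, freeze){
  # Previous observation at each position (NA for the first one)
  prev <- dplyr::lag(temp)
  # Event positions: freeze = positive -> non-positive, thaw = negative -> non-negative
  if(freeze)
    idx <- which(temp <= 0 & prev > 0)
  else
    idx <- which(temp >= 0 & prev < 0)

  vec <- rep(NA, length(temp))
  # No event in the data: there is nothing to count from
  if(length(idx) == 0) return(vec)

  # Number of observations from each event up to the next one (or the end)
  gaps <- diff(c(idx, length(temp) + 1))
  # Counter that restarts at 0 at every event
  vec[min(idx):length(temp)] <- unlist(lapply(gaps, \(x) seq_len(x) - 1))
  vec
}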

Output:

f(Temperature, freeze = TRUE)
[1] NA NA NA NA NA NA NA NA NA NA  0  1  2  3  4  5  6  7  8  9 10 11 12  0  1  2  3

f(Temperature, freeze = FALSE)
[1] NA NA NA  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14  0  1  2  3  4  5  6  7  8
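For readers who want to avoid the dplyr dependency, here is a minimal base R sketch of the same approach. It is not the answer's own code: `lag1()` and `time_since_event()` are hypothetical names introduced for illustration, and `lag1()` assumes a lag of exactly one observation.

```r
# Hypothetical helper (not from the answer): base R stand-in for dplyr::lag()
# with a lag of 1. The first element has no predecessor, so it becomes NA.
lag1 <- function(x) c(NA, head(x, -1))

# Hypothetical name; same logic as f() above, using only base R.
time_since_event <- function(temp, freeze) {
  prev <- lag1(temp)
  # which() drops the NA produced by the first comparison
  idx <- if (freeze) which(temp <= 0 & prev > 0) else which(temp >= 0 & prev < 0)
  out <- rep(NA_real_, length(temp))
  if (length(idx) == 0) return(out)          # no event: all NA
  gaps <- diff(c(idx, length(temp) + 1))     # observations covered by each event
  out[idx[1]:length(temp)] <- unlist(lapply(gaps, function(n) seq_len(n) - 1))
  out
}

Temperature <- c(-3:3, 3:-3, rep(-3, 2), -2:-1, 1:3, 2:1, -1:-4)
time_since_event(Temperature, freeze = TRUE)
time_since_event(Temperature, freeze = FALSE)
```

On the question's vector, the two calls should reproduce the outputs shown above.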
Maël
  • Thanks. Can you share that `base` solution you had up a few minutes ago too? – David Moore May 18 '22 at 14:49
  • It was a "drafty" version of this function, with the exact same functions. In this function, only `dplyr::lag` is not base R. Do you prefer a 100% base R option? – Maël May 18 '22 at 14:52
  • I'd love to see a `base` R solution. Also, I'm going to update my question - in my actual data, I don't have any exact `0`s, so we can't really use `0` as a benchmark value. – David Moore May 18 '22 at 14:56
  • See edit. For a base R version of dplyr::lag, it's a bit tricky, you can check here: https://stackoverflow.com/questions/56807120/lag-and-lead-in-base-r – Maël May 18 '22 at 15:06
  • Maël - what does the backslash do? – David Moore May 18 '22 at 15:20
  • It is a shortcut for `function`, introduced in R 4.1.0 – Maël May 18 '22 at 15:23
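To illustrate the backslash shorthand discussed in the comments, the small sketch below writes the same anonymous function both ways. The input `c(2, 3)` is an arbitrary example, and the short form assumes R 4.1.0 or later.

```r
# Both calls pass the same anonymous function; `\(x)` needs R >= 4.1.0.
long_form  <- sapply(c(2, 3), function(x) seq_len(x) - 1)
short_form <- sapply(c(2, 3), \(x) seq_len(x) - 1)

identical(long_form, short_form)  # TRUE
```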

This is not your desired output, but you might find `seqle()` from the cgwtools package useful:

temp <- c(-3:3, 3:-3, rep(-3, 2), -2:3, 2:-4)
which(temp > 0)
 [1]  5  6  7  8  9 10 20 21 22 23 24
cgwtools::seqle(which(temp > 0))
Run Length Encoding
  lengths: int [1:2] 6 5
  values : int [1:2] 5 20
cgwtools::seqle(which(temp <= 0))
Run Length Encoding
  lengths: int [1:3] 4 9 5
  values : int [1:3] 1 11 25

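This summarizes the runs of above-zero and at-or-below-zero observations: `values` gives the index at which each run starts and `lengths` gives the number of observations in it.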
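A similar summary can be sketched in base R with `rle()`, without cgwtools. This is an illustrative alternative rather than part of the answer; it uses the question's `Temperature` vector, and the names `runs` and `starts` are made up for the example.

```r
Temperature <- c(-3:3, 3:-3, rep(-3, 2), -2:-1, 1:3, 2:1, -1:-4)

# Run-length encode the "above zero" indicator
runs <- rle(Temperature > 0)

# Index at which each run starts: 1, then 1 + cumulative lengths of earlier runs
starts <- cumsum(c(1, head(runs$lengths, -1)))

# One row per run: where it starts, how long it lasts, and whether it is above zero
data.frame(start = starts, length = runs$lengths, above_zero = runs$values)
```

Each row with `above_zero = TRUE` that follows a `FALSE` row marks a thaw in the sense of the question, and the reverse marks a freeze.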

Chris

Another possible solution is to use the tidyverse after converting Temperature to a data frame:

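library(tidyverse)
library(rlang)

# Add one column per event type ("thaw", then "freeze"), starting from the data frame
reduce(list(data.frame(Temperature), "thaw", "freeze"), \(y, x) y %>%
          # Mark each event row with its row number; all other rows are NA
          mutate(!!x := if (x == "thaw")
            ifelse(lag(Temperature < 0) & Temperature >= 0, row_number(), NA ) else
              ifelse(lag(Temperature > 0) & Temperature <= 0, row_number(), NA )) %>%
          # Carry the row number of the most recent event forward
          fill(all_of(x)) %>% 
          # Count observations within each event's group, starting at 0
          group_by(!!parse_expr(x)) %>% 
          mutate(!!x := ifelse(is.na(!!parse_expr(x)), NA, row_number()-1)) %>% 
          ungroup)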

#> # A tibble: 27 x 3
#>    Temperature  thaw freeze
#>          <dbl> <dbl>  <dbl>
#>  1          -3    NA     NA
#>  2          -2    NA     NA
#>  3          -1    NA     NA
#>  4           0     0     NA
#>  5           1     1     NA
#>  6           2     2     NA
#>  7           3     3     NA
#>  8           3     4     NA
#>  9           2     5     NA
#> 10           1     6     NA
#> # ... with 17 more rows
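The mark, fill, and count steps used above can also be sketched in base R with `cumsum()` and `ave()`. This is an illustrative alternative, not the answer's code: `time_since()` is a hypothetical helper name, and the event definitions (`<= 0` and `>= 0`) are copied from the answer.

```r
Temperature <- c(-3:3, 3:-3, rep(-3, 2), -2:-1, 1:3, 2:1, -1:-4)

# Previous observation; the first one has no predecessor
prev <- c(NA, head(Temperature, -1))

# Logical event flags; the first observation can never be an event
thaw   <- !is.na(prev) & prev < 0 & Temperature >= 0
freeze <- !is.na(prev) & prev > 0 & Temperature <= 0

# Hypothetical helper: observations elapsed since the most recent TRUE in `event`
time_since <- function(event) {
  grp <- cumsum(event)                                    # 0 before the first event
  out <- ave(seq_along(event), grp, FUN = seq_along) - 1  # counter within each group
  out[grp == 0] <- NA                                     # nothing to count from yet
  out
}

data.frame(Temperature, thaw = time_since(thaw), freeze = time_since(freeze))
```

Here `cumsum()` plays the role of `fill()` plus `group_by()`: every observation gets the number of events seen so far, and the counter restarts whenever that number changes.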
PaulS