I would like to find the largest number of consecutive days on which my variable is above 50. My dataset looks like this:

dp <- dput(head(df, 20))

dp = structure(list(day = 1:20, month = c(1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), year = c(1990L, 
1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 
1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 1990L, 
1990L), variable = c(46.8, 51.3, 51.2, 51.9, 51.4, 50.9, 51.4, 
51.6, 51.5, 49.9, 49.4, 49.1, 51.7, 51.8, 50.9, 51, 51.9, 52.5, 
52.5, 49.1)), .Names = c("day", "month", "year", "variable"), row.names = c(NA, 
20L), class = "data.frame")

Thanks a lot in advance

A.Alfredo
  • Please provide a reproducible sample using `dput`, for example the first 20 or so lines: `dput(head(df, 20))`. – lmo Aug 03 '16 at 13:51
  • Please see [how to create a reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). Include data that can actually be copied into R and that we can use for testing (stuff with ... isn't helpful). – MrFlick Aug 03 '16 at 14:00
  • In your example all days are consecutive. So what would be your expected output? – Sotos Aug 03 '16 at 14:20
  • I'm finding spells of length 8 and 7 in your example, not 2 and 10. – Frank Aug 03 '16 at 14:33

1 Answer

You can use `rle` (run-length encoding) together with its inverse, `inverse.rle`, to flag the longest run of days above the limit. I use data.table here for its easy group-by functionality:

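fun <- function(x, lim) {
  y <- x > lim                  # TRUE where the value exceeds the limit
  z <- rle(y)                   # run-length encode the TRUE/FALSE runs
  # pick the longest TRUE run; multiplying by the values zeroes out FALSE runs,
  # so a long stretch below the limit cannot be selected by mistake
  longest <- which.max(z$lengths * z$values)
  z$values[-longest] <- FALSE   # keep only that run
  inverse.rle(z)                # expand back to a logical vector of length(x)
}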

library(data.table)
setDT(dp)
dp[, {
  ind <- fun(variable, 50)
  list(count = sum(ind), start_day = day[ind][1], end_day = tail(day[ind], 1))
}, by = .(month, year)]
#   month year count start_day end_day
#1:     1 1990     8         2       9
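
To see where the count of 8 comes from, it can help to look at the run-length encoding directly. The following is a small sketch in base R that copies the `variable` column from the question; the names `runs` and `max_run` are only illustrative.

```r
# values copied from the question's dput output
variable <- c(46.8, 51.3, 51.2, 51.9, 51.4, 50.9, 51.4, 51.6, 51.5, 49.9,
              49.4, 49.1, 51.7, 51.8, 50.9, 51, 51.9, 52.5, 52.5, 49.1)

# encode consecutive TRUE/FALSE stretches of "above 50"
runs <- rle(variable > 50)
runs$lengths  # 1 8 3 7 1
runs$values   # FALSE TRUE FALSE TRUE FALSE

# longest stretch above 50; the extra 0 covers the case with no TRUE run at all
max_run <- max(runs$lengths[runs$values], 0)
max_run       # 8
```

The two TRUE runs have lengths 8 and 7, which matches the spells mentioned in the comments; the function above keeps only the first and longest of them.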

Note that your example data all comes from a single month, so the result contains only one group.
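
Since the sample covers only one month, the grouping is not really exercised. As a rough sketch of how the per-group count might look in base R, here is a toy data set that is invented purely for illustration (it is not from the question), together with a hypothetical helper `longest_run`:

```r
# toy data, made up for this example: two months with different run patterns
toy <- data.frame(
  month    = c(1, 1, 1, 1, 2, 2, 2, 2, 2),
  variable = c(51, 52, 49, 53, 49, 51, 52, 53, 48)
)

# hypothetical helper: length of the longest run of values above `lim`
longest_run <- function(x, lim) {
  r <- rle(x > lim)
  max(r$lengths[r$values], 0)  # 0 if no value exceeds the limit
}

# apply the helper within each month
res <- tapply(toy$variable, toy$month, longest_run, lim = 50)
res
# month 1 gives 2, month 2 gives 3
```

This only returns the counts; if the start and end days are needed as well, the data.table approach above is the more convenient one. Missing values in `variable` are not handled in this sketch.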

Roland