I'm working with accelerometer data (the SB column) and would like to add a variable ("SB_count") that counts the length of an activity bout, e.g. sitting (SB), and restarts counting after the person gets up. In a second step, I would like to create a second variable ("SB_bout") which keeps only the final bout length values.

I've been stuck on this for a while, probably because I was using the wrong search terms, so I would really appreciate it if someone could point me in the right direction.

This is what it should look like:

      SB      SB_count  SB_bout
1     1       1         0
2     1       2         0
3     1       3         3
4     0       0         0
5     1       1         0
6     1       2         2
Flip
  • Do you have any data to start with? What is your starting point? – Sven Jun 07 '19 at 10:30
  • Hi, sorry for not clarifying this. What I have is the SB-column. – Flip Jun 07 '19 at 10:52
  • Does SB-bout need to be 0 for each row except the maximum value? Or can it show the maximum value of that run for every row in that run? – Sven Jun 07 '19 at 11:44

2 Answers

I think I cracked it using your toy example. For SB_bout I used @Tommy's function for finding local peaks in a vector. It should also work for other data you may have in this format, but you should have a look at the specifics of the function nevertheless.

Data <- data.frame(SB = c(1,1,1,0,1,1))

Data$SB_count <- ave(Data$SB, cumsum(Data$SB==0), FUN=cumsum)

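# Find-peaks function: returns the indices of the local maxima of x
localMaxima <- function(x) {
  # Use -Inf instead if x is numeric (non-integer)
  y <- diff(c(-.Machine$integer.max, x)) > 0L   # TRUE where x increases
  y <- cumsum(rle(y)$lengths)                   # end index of each rising/falling run
  y <- y[seq.int(1L, length(y), 2L)]            # rising runs end at a peak
  if (x[[1]] == x[[2]]) {                       # a flat start is not a peak
    y <- y[-1]
  }
  y
}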

Data$SB_bout <- Data$SB_count
Data$SB_bout[-localMaxima(Data$SB_count)] <- 0

Data

  SB SB_count SB_bout
1  1        1       0
2  1        2       0
3  1        3       3
4  0        0       0
5  1        1       0
6  1        2       2
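
To see why the `ave()` line yields a counter that restarts, it can help to print the intermediate grouping vector. This is only a small sketch on the toy data; the name `group_id` is introduced here for illustration and does not appear in the code above.

```r
SB <- c(1, 1, 1, 0, 1, 1)

# Every 0 starts a new group, so the running total of (SB == 0) acts as a group id.
group_id <- cumsum(SB == 0)
print(group_id)
# 0 0 0 1 1 1

# ave() applies cumsum() separately within each group and returns a vector
# of the same length as SB, so the count restarts after every 0.
SB_count <- ave(SB, group_id, FUN = cumsum)
print(SB_count)
# 1 2 3 0 1 2
```

The 0 row itself opens each new group, which is why it gets a count of 0 and the following sitting rows start again at 1.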
Claudiu Papasteri

I've found a solution using rle, fill and mutate. First, I recreated your starting point:

library(tidyr)
library(dplyr)

SB <- c(1,1,1,0,1,1)
df <- data.frame(SB)

Then I added SB_count using rle. I also needed a run number in order to group afterwards:

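df$SB_count <- sequence(rle(df$SB)$lengths)    # position within each run
df$SB_count[df$SB == 0] <- 0                   # do not count non-sitting rows
nstarts <- sum(df$SB_count == 1)               # number of sitting bouts
df$run <- NA                                   # run number, set at each bout start
df$run[df$SB_count == 1] <- seq_len(nstarts)
df <- fill(df, run)                            # carry the run number down the bout
df <- df[, c("run", "SB", "SB_count")]         # reorder columns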

Finally, I grouped by run and added the maximum value:

df <- df %>% group_by(run) %>%
  mutate(SB_bout = max(SB_count))

df$run[df$SB == 0] <- 0
df$SB_bout[df$SB == 0] <- 0

This gives the following output:

    run    SB SB_count SB_bout
  <dbl> <dbl>    <dbl>   <dbl>
1     1     1        1       3
2     1     1        2       3
3     1     1        3       3
4     0     0        0       0
5     2     1        1       2
6     2     1        2       2

The only difference from your expected output is that I'm showing the maximum SB_bout in every row of that run, rather than only in its last row.
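
If SB_bout should be non-zero only on the last row of each bout, as in the question, one possible base R sketch is to take the run end positions from `rle()` directly. The helper name `bout_columns` is hypothetical, and the sketch assumes SB contains only 0 and 1 with no missing values.

```r
# Hypothetical helper (not part of the answer above): builds both columns
# from the run lengths of SB. Assumes SB is 0/1 without NAs.
bout_columns <- function(sb) {
  runs <- rle(sb)
  count <- sequence(runs$lengths)            # 1, 2, 3, ... within every run
  count[sb == 0] <- 0                        # no counting while not sitting
  run_end <- cumsum(runs$lengths)            # index of the last row of each run
  sitting_end <- run_end[runs$values == 1]   # keep only the ends of sitting runs
  bout <- numeric(length(sb))                # all zeros to start with
  bout[sitting_end] <- count[sitting_end]    # final bout length at the last row
  data.frame(SB = sb, SB_count = count, SB_bout = bout)
}

result <- bout_columns(c(1, 1, 1, 0, 1, 1))
print(result)
```

On the toy data this gives SB_bout values of 0, 0, 3, 0, 0, 2, matching the table in the question. Because the run ends come straight from `rle()`, no grouping step or peak search is needed.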

Sven