
I am currently working with a dataset containing sensor data, and I would like some summary statistics per sensor: the number of visits and the total occupancy length. A visit is considered finished when a timestamp with value 1 is followed by 0 values spanning X minutes.

My data looks like this:

SensorId          timestamp          value
1                 10:10:10            1
1                 10:12:10            1
1                 10:14:00            1
1                 10:16:00            0
1                 10:18:00            0
1                 10:20:00            0
2                 13:10:10            1
2                 13:12:10            1
2                 13:14:00            1
2                 13:20:00            1
2                 13:22:00            0
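
For reproducibility, the sample above could be built as a data frame along these lines. This is only a sketch: the name `df` and the column types (integer ids and values, character timestamps) are assumptions, since the real data may be stored differently.

```r
# Sample sensor readings from the table above.
# Timestamps are kept as "HH:MM:SS" strings; no date is given in the data.
df <- data.frame(
  SensorId  = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L),
  timestamp = c("10:10:10", "10:12:10", "10:14:00", "10:16:00",
                "10:18:00", "10:20:00", "13:10:10", "13:12:10",
                "13:14:00", "13:20:00", "13:22:00"),
  value     = c(1L, 1L, 1L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 0L),
  stringsAsFactors = FALSE
)

str(df)
```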

This is my desired result:

SensorId          total time in use          Number of visits
1                 4                             1
2                 10                            1

There are quite a lot of rows, so I want the total time in use and the number of visits to update each time.

Urge
  • Please add a [minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610). That way you can help others to help you! – dario Feb 10 '20 at 14:16
  • Additionally, what does that mean: >One visit is defined by no time stamps, or several 0 values over X amount of minutes. – dario Feb 10 '20 at 14:20

1 Answer

We can convert timestamp to the POSIXct class, sort the rows, group them by SensorId and by runs of consecutive identical value, and subtract the first timestamp of each run from the last.

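library(dplyr)

df %>%
  # parse the "HH:MM:SS" strings as date-times (the current date is filled in)
  mutate(timestamp = as.POSIXct(timestamp, format = "%T")) %>%
  arrange(SensorId, timestamp) %>%
  # rleid() gives each run of consecutive identical values its own id
  group_by(SensorId, grp = data.table::rleid(value)) %>%
  # duration of each run, always expressed in minutes
  summarise(total_time = round(difftime(last(timestamp), first(timestamp), units = "mins")),
            number_of_visit = first(value)) %>%
  # keep only the runs where the sensor was in use
  filter(number_of_visit == 1) %>%
  select(-grp)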

#  SensorId total_time number_of_visit
#     <int> <drtn>               <int>
#1        1  4 mins                  1
#2        2 10 mins                  1
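
The result above has one row per run of 1s, so a sensor with several visits would appear on several rows. A possible extension, sketched here under assumptions, rolls the runs up to one row per sensor. The data frame `readings`, its values, and the output column names are hypothetical and only serve to show a sensor with two visits; the "X minutes" threshold from the question is not applied, so every run of 1s counts as its own visit.

```r
library(dplyr)

# Hypothetical readings: sensor 1 is visited twice, sensor 2 once.
readings <- data.frame(
  SensorId  = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L),
  timestamp = c("09:00:00", "09:05:00", "09:10:00", "09:30:00",
                "09:40:00", "09:52:00", "09:55:00",
                "11:00:00", "11:08:00", "11:10:00"),
  value     = c(1L, 1L, 0L, 0L, 1L, 1L, 0L, 1L, 1L, 0L)
)

per_sensor <- readings %>%
  mutate(timestamp = as.POSIXct(timestamp, format = "%T")) %>%
  arrange(SensorId, timestamp) %>%
  group_by(SensorId) %>%
  # start a new run id whenever value changes within a sensor
  mutate(run = cumsum(value != lag(value, default = first(value)))) %>%
  group_by(SensorId, run) %>%
  # one row per run: was it an "in use" run, and how long did it last
  summarise(in_use  = first(value) == 1,
            minutes = as.numeric(difftime(last(timestamp), first(timestamp),
                                          units = "mins")),
            .groups = "drop") %>%
  filter(in_use) %>%
  group_by(SensorId) %>%
  # one row per sensor: summed duration and count of runs
  summarise(total_time_in_use = round(sum(minutes)),
            number_of_visits  = n(),
            .groups = "drop")

print(per_sensor)
```

Here the run id is built with `cumsum()` and `lag()` instead of `data.table::rleid()`, which avoids the extra dependency; either should give the same grouping.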
Ronak Shah