
All,

I want to quantify the operating time of a remote sensor by checking whether the sensor generated a value within a set time period (2 hours), which would indicate that the sensor was functioning during that period. My data frame has a datetime variable formatted as Y-M-D H:M:S (example: 2020-04-06 09:50:00) and one site variable (with 6 different sites); I want to evaluate the operating time for each site.

All help is appreciated.

Edit*

Here is dput of the head of my data. I'm not sure if this is how I am supposed to provide it.

structure(list(datetime = structure(c(1564618522, 1564618874, 1564618933, 
1564618994, 1564619054, 1564622122), class = c("POSIXct", 
"POSIXt"), tzone = "UTC"), fracsec = c(0.75, 0.33, 0.57, 0.1, 
0.07, 0.95), duration = c(NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_), tagtype = c(NA_character_, NA_character_, 
NA_character_, NA_character_, NA_character_, NA_character_), 
PITnum = c("999000000007426", "985121002397230", "985121002397230", 
"985121002397230", "985121002397230", "999000000007426"), 
consdetc = c(NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
NA_integer_, NA_integer_), arrint = c(NA_integer_, NA_integer_, 
NA_integer_, NA_integer_, NA_integer_, NA_integer_), site = c("DSDS", 
"DSDS", "DSDS", "DSDS", "DSDS", "DSDS"), manuf = c("Biomark", 
"Biomark", "Biomark", "Biomark", "Biomark", "Biomark"), srcfile = c("C:\\Users\\jrjohnson\\Documents\\MoraPIT\\julyAllArraysAWformat\\dsds\\Archive\\2020-04-01_DSDS_08092019.txt", 
"C:\\Users\\jrjohnson\\Documents\\MoraPIT\\julyAllArraysAWformat\\dsds\\Archive\\2020-04-01_DSDS_08092019.txt", 
"C:\\Users\\jrjohnson\\Documents\\MoraPIT\\julyAllArraysAWformat\\dsds\\Archive\\2020-04-01_DSDS_08092019.txt", 
"C:\\Users\\jrjohnson\\Documents\\MoraPIT\\julyAllArraysAWformat\\dsds\\Archive\\2020-04-01_DSDS_08092019.txt", 
"C:\\Users\\jrjohnson\\Documents\\MoraPIT\\julyAllArraysAWformat\\dsds\\Archive\\2020-04-01_DSDS_08092019.txt", 
"C:\\Users\\jrjohnson\\Documents\\MoraPIT\\julyAllArraysAWformat\\dsds\\Archive\\2020-04-01_DSDS_08092019.txt"
), srcline = 21:26, compdate = structure(c(18353, 18353, 
18353, 18353, 18353, 18353), class = "Date")), spec = structure(list(
cols = list(datetime = structure(list(format = ""), class = 
c("collector_datetime", 
"collector")), fracsec = structure(list(), class = c("collector_double", 
"collector")), duration = structure(list(), class = c("collector_double", 
"collector")), tagtype = structure(list(), class = 
c("collector_character", 
"collector")), PITnum = structure(list(), class = c("collector_character", 
"collector")), consdetc = structure(list(), class = c("collector_integer", 
"collector")), arrint = structure(list(), class = c("collector_integer", 
"collector")), site = structure(list(), class = c("collector_character", 
"collector")), manuf = structure(list(), class = c("collector_character", 
"collector")), srcfile = structure(list(), class = 
c("collector_character", 
"collector")), srcline = structure(list(), class = c("collector_integer", 
"collector")), compdate = structure(list(format = "%Y-%m-%d"), class = 
c("collector_date", 
"collector"))), default = structure(list(), class = c("collector_guess", 
"collector")), skip = 0), class = "col_spec"), row.names = 23803:23808, 
class = "data.frame")
JRJohnson
  • Provide some data in `dput()` format. See [this](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – UseR10085 Apr 06 '20 at 16:56

1 Answer


Here's a way to do this using the := operator in data.table:

Sample data

library(data.table)

time_threshold <- Sys.time() + 180

dat <- data.table(
  time = seq.POSIXt(from = Sys.time(), by = 60, length.out = 10),
  value = rnorm(n = 10, mean = 10, sd = 2)
)

Code

To add a new variable based on time and value columns:

> time_threshold
[1] "2020-04-06 13:08:42 EDT"
> dat
                   time     value
 1: 2020-04-06 13:05:42  8.240336
 2: 2020-04-06 13:06:42  9.744952
 3: 2020-04-06 13:07:42  6.984802
 4: 2020-04-06 13:08:42  8.015951
 5: 2020-04-06 13:09:42 13.435096
 6: 2020-04-06 13:10:42 10.835025
 7: 2020-04-06 13:11:42  7.216484
 8: 2020-04-06 13:12:42  9.559917
 9: 2020-04-06 13:13:42  8.320369
10: 2020-04-06 13:14:42 13.201530
> dat[ time >= time_threshold & value >= 10, new_variable := 1]
> dat
                   time     value new_variable
 1: 2020-04-06 13:05:42  8.240336           NA
 2: 2020-04-06 13:06:42  9.744952           NA
 3: 2020-04-06 13:07:42  6.984802           NA
 4: 2020-04-06 13:08:42  8.015951           NA
 5: 2020-04-06 13:09:42 13.435096            1
 6: 2020-04-06 13:10:42 10.835025            1
 7: 2020-04-06 13:11:42  7.216484           NA
 8: 2020-04-06 13:12:42  9.559917           NA
 9: 2020-04-06 13:13:42  8.320369           NA
10: 2020-04-06 13:14:42 13.201530            1

You could also look at the mutate option with dplyr.
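
A minimal sketch of the dplyr route, in case it helps. The first part reproduces the same flag with mutate(); the fixed start time and seed are assumptions added only to make the sample reproducible. The second part is a guess at the rolling logic the question seems to be after: it measures the gap since the previous detection at each site and marks gaps longer than 2 hours. The names gap_hours and gap_over_2h are hypothetical, and the six timestamps are the ones from the dput() output in the question.

library(dplyr)

set.seed(1)  # assumption: fixed seed so the sample values are reproducible
start <- as.POSIXct("2020-04-06 13:05:42", tz = "UTC")  # assumption: fixed start time
time_threshold <- start + 180

dat <- tibble(
  time  = seq(from = start, by = 60, length.out = 10),
  value = rnorm(n = 10, mean = 10, sd = 2)
)

# Same flag as the data.table version: 1 where both conditions hold, NA otherwise
dat <- dat %>%
  mutate(new_variable = if_else(time >= time_threshold & value >= 10, 1, NA_real_))

# Hypothetical extension: time since the previous detection, computed per site
detections <- tibble(
  site     = "DSDS",
  datetime = as.POSIXct(
    c(1564618522, 1564618874, 1564618933, 1564618994, 1564619054, 1564622122),
    origin = "1970-01-01", tz = "UTC"
  )
)

gaps <- detections %>%
  group_by(site) %>%
  arrange(datetime, .by_group = TRUE) %>%
  mutate(
    # NA for the first detection at each site, since there is no earlier row
    gap_hours   = as.numeric(difftime(datetime, lag(datetime), units = "hours")),
    gap_over_2h = gap_hours > 2
  ) %>%
  ungroup()

Under this reading, rows where gap_over_2h is TRUE would mark stretches in which the sensor produced nothing for more than 2 hours. Whether that matches the intended definition of "functioning" depends on details the question does not give.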

Gautam
  • Thanks. That is pretty close to what I am trying to do. Using your example, I would want time_threshold to reset every time the sensor generates a new value (instead of referencing the initial value in the dataset). With my logic, that would produce an output in new_variable that indicates whether or not the sensor was functioning, on a rolling basis over a long-term time series. – JRJohnson Apr 06 '20 at 17:30
  • Could you show an example of your current dataset and what you'd like it to look like (with the new columns and the logic)? – Gautam Apr 06 '20 at 17:48