I have a dataset within a reactive dygraph that looks like this:

data()$data.o

date data.o
2022-07-21 12:10 AM 400.1
2022-07-21 12:11 AM 33.9
2022-07-21 12:12 AM 32.5
2022-07-21 12:13 AM 35.1
2022-07-21 12:14 AM 31.5
2022-07-21 12:15 AM 39.5

I want to find the max value in the last 5 minutes so I can set my axis scale accordingly.

I've tried:

enddate = max(data()$date)
startdate = enddate - (60*5)
oMx <- max(datao.xts[startdate/enddate], na.rm = T)

But I get an error using datao.xts.

Is there a better way to go about this?

Edit:

Trying this on a larger dataset for the last 480 minutes returns -Inf:

xts <- xts::xts(hvilleo, order.by = data()$date)
enddate <- end(xts)
startdate <- enddate - (480-1) * 60
xts[paste0(startdate, enddate, sep = "/")] |> max(na.rm = TRUE)
xts |> utils::tail(480) |> max(na.rm = TRUE)
Kyle
1 Answer


As far as I know, xts expects the startdate/enddate range as a single character string of the form "start/end", cf. Joshua's response here. That's it.

# allow me to create an xts object beforehand
datetimes <- c("2022-07-21 12:10 AM",
               "2022-07-21 12:11 AM",
               "2022-07-21 12:12 AM",
               "2022-07-21 12:13 AM",
               "2022-07-21 12:14 AM",
               "2022-07-21 12:15 AM") |> 
  strptime(format = "%Y-%m-%d %I:%M %p") |> 
  as.POSIXct()

data <- c(400.1, 33.9, 32.5, 35.1, 31.5, 39.5)

xts <- xts::xts(data, order.by = datetimes)

# minor adjustments to your approach
enddate <- end(xts)
startdate <- enddate - (5-1) * 60

# paste(), not paste0(): paste0() has no sep argument and would append "/" at the end
xts[paste(startdate, enddate, sep = "/")] |> max()
#> [1] 39.5

# assuming you are interested in the last five observations
xts |> utils::tail(5) |> max()
#> [1] 39.5
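
The time-window selection above can also be written without building a range string at all. A minimal sketch, assuming the same six sample observations at one-minute spacing; the helper name `max_last_minutes` and the fixed "UTC" time zone are illustrative assumptions, not part of the original approach:

```r
library(xts)

# Sample data from the question, parsed in a fixed time zone (assumption: UTC)
datetimes <- as.POSIXct(
  c("2022-07-21 12:10 AM", "2022-07-21 12:11 AM", "2022-07-21 12:12 AM",
    "2022-07-21 12:13 AM", "2022-07-21 12:14 AM", "2022-07-21 12:15 AM"),
  format = "%Y-%m-%d %I:%M %p", tz = "UTC"
)
x <- xts(c(400.1, 33.9, 32.5, 35.1, 31.5, 39.5), order.by = datetimes)

# Hypothetical helper: max over the last `minutes` minutes of data,
# selected by comparing the index against a cutoff time
max_last_minutes <- function(x, minutes) {
  cutoff <- end(x) - (minutes - 1) * 60
  recent <- x[index(x) >= cutoff]
  max(recent, na.rm = TRUE)
}

max_last_minutes(x, 5)
#> [1] 39.5
```

Comparing the index directly avoids any dependence on how the timestamps are formatted as text, at the cost of being a little more verbose than the "start/end" string.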
dimfalk
  • `as.POSIXct(x, format = fmt)` can be used to create `datetimes`, where `x` is the character vector of date-times and `fmt` is the format string. Also, if `x` is an xts object, `end(x)` is the last and largest index. – G. Grothendieck Jul 22 '22 at 17:57
  • Hey! `end(xts)` to replace `zoo::index(xts) |> max()` is neat, thank you very much for your hint! But how come `as.POSIXct(c(...), format = "%Y-%m-%d %I:%M %p")` is not equal to the pipe I used `c(...) |> strptime(format = "%Y-%m-%d %I:%M %p") |> as.POSIXct()`? `all.equal()` gives me the following output: `"Mean absolute difference: 43200"`. Seems like the AM/PM indicator `%p` fails. – dimfalk Jul 22 '22 at 19:40
  • The time zone is my local one by default with both approaches. The same applies to the origin. Specifying `origin = "1970-01-01", tz = "etc/GMT-2"` explicitly does not change anything. Differences remain in the numeric representation of the timestamps. It looks like `strptime(format = x)` and `as.POSIXct(format = x)` behave differently when `%I` is combined with `%p`. Are you able to reproduce this with the code chunk in the comment above? – dimfalk Jul 23 '22 at 06:15
  • I can't reproduce this. If I run the following code under R 4.2.0 then I get TRUE. `d <- c("2022-07-21 12:10 AM", "2022-07-21 12:11 AM", "2022-07-21 12:12 AM", "2022-07-21 12:13 AM", "2022-07-21 12:14 AM", "2022-07-21 12:15 AM"); fmt <- "%Y-%m-%d %I:%M %p"; t1 <- as.POSIXct(d, format = fmt); t2 <- as.POSIXct(strptime(d, format = fmt)); identical(t1, t2)` – G. Grothendieck Jul 23 '22 at 16:56
  • Neither can I at the moment. That's pretty weird, but ok, at least it seems to be quite fine right now. Thank you very much for your efforts! – dimfalk Jul 23 '22 at 17:01
  • Thank you for this. I'm trying this on a bigger dataset that includes some blank values and I'm getting an "NA" returned when I try to get the max. How would I modify that code to include na.rm = T? – Kyle Jul 27 '22 at 15:53
  • If I add `|> max(na.rm = TRUE)` it returns `-Inf` – Kyle Jul 27 '22 at 16:09
  • According to [this post](https://stackoverflow.com/a/24520992/11709296), if all your values of `xts[paste0(startdate, enddate, sep = "/")]` are `NA`, `-Inf` is returned. There is a workaround mentioned in the answer provided by @coffeinjunky. – dimfalk Jul 27 '22 at 19:23
  • There are only a few blank values in the data; fewer than 10 percent are blank. If I don't include `na.rm = TRUE` the value I get is `NA`; if I do include it, I get `-Inf`. – Kyle Jul 27 '22 at 20:13
  • It would be easier to help if you apply `dput()` on a subset of your data (and edit your question providing the object) allowing others to reproduce this behaviour. – dimfalk Jul 27 '22 at 21:44
  • I attempted dput(), hopefully I did it right! – Kyle Jul 28 '22 at 13:25
  • A smaller subset would have been sufficient, but hey.. ;-) The last 27 values of your object are `NA`, so there is no way you can get a meaningful value if you feed only `NA` values to `max()`. Your larger subset with `n = 480` returns 411 in both cases when making use of `max(na.rm = TRUE)`. Seems to be working according to specifications from my point of view. – dimfalk Jul 28 '22 at 15:09
  • I'm struggling because I'm actually working in a reactive dataframe, so my data() cannot be subsetted. Anyway, I didn't change a thing and it seems to be working now. Not sure why or how, but thanks again! – Kyle Jul 28 '22 at 18:09
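
The `-Inf` result discussed in the comments can be reproduced with base R alone. A minimal sketch, assuming a plain numeric vector; the helper name `safe_max` is hypothetical and only illustrates one way to guard against a window that contains nothing but `NA`:

```r
# max() over only-NA input with na.rm = TRUE has nothing left to compare,
# so it returns -Inf (and emits a warning, suppressed here)
all_na <- c(NA_real_, NA_real_, NA_real_)
suppressWarnings(max(all_na, na.rm = TRUE))
#> [1] -Inf

# Hypothetical helper: return NA instead of -Inf when no values remain
safe_max <- function(v) {
  v <- v[!is.na(v)]
  if (length(v) == 0) NA_real_ else max(v)
}

safe_max(all_na)
#> [1] NA
safe_max(c(31.5, NA, 39.5))
#> [1] 39.5
```

For an xts object, the values would first need to be extracted as a plain vector (for example with `as.numeric()`) before being passed to a helper like this.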