
I want to create a graph that plots only the maximum value (depth) per hour over the entire dataset (FYI, my datasets have hundreds of thousands of data points). I have been told that R prefers times as integers, so I have already converted my times into minutes and hours past midnight.

Example of the dataset:

Date      Time      Minutes past midnight  Hours past midnight  Depth
4-Nov-08  21:19:00  1279                   21.3167              3

Eventually we are trying to create a cyclical GAM from this data, so if anyone has any code for that, it would be amazing as well!

Thanks!

r2evans
Jackie Reuder
  • Times do better as integer *or numeric*, not strictly integers. Having said that, please don't make us reproduce that which you have already solved. The best way to share sample data that contains `POSIXt` or `Date`-class columns is to use `dput` (see https://stackoverflow.com/q/5963269, [mcve], and https://stackoverflow.com/tags/r/info). – r2evans Mar 11 '22 at 18:31
  • Does this answer your question? https://stackoverflow.com/q/29903188/3358272 – r2evans Mar 11 '22 at 18:33
  • I don't think so. I want R to put only the maximum values from each hour into a plot where X is the hour and Y is only the maximum for that hour, across my dataset. I don't necessarily need to have a table of the maximums per hour, just a plot. – Jackie Reuder Mar 11 '22 at 18:43
  • Ummm ... to produce a plot, you need the data. Aggregation of data in a frame is typically stored in a frame (aka table). – r2evans Mar 11 '22 at 18:59
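
The point made in the last comment, that the plot is drawn from an aggregated table, can be sketched in base R. This is only an illustration: the data frame `dat`, its column names, and every row except the first are made up to mimic the layout shown in the question, and the hour is read from the first two characters of `Time` on the assumption that hours are zero-padded.

```r
# Hypothetical sample in the layout from the question
# (only the first row comes from the question; the rest are invented).
dat <- data.frame(
  Date  = c("4-Nov-08", "4-Nov-08", "4-Nov-08", "5-Nov-08", "5-Nov-08"),
  Time  = c("21:19:00", "21:45:00", "22:05:00", "21:10:00", "22:30:00"),
  Depth = c(3, 7, 2, 5, 9)
)

# Hour of day (0-23), assuming Time is always "HH:MM:SS" with zero-padded hours
dat$hour <- as.integer(substr(dat$Time, 1, 2))

# One row per hour of day: the maximum depth seen in that hour on any date
hourly_max <- aggregate(Depth ~ hour, data = dat, FUN = max)

# The plot is built from the small aggregated frame, not from the raw rows
plot(hourly_max$hour, hourly_max$Depth,
     xlab = "Hour of day", ylab = "Maximum depth")
```

The aggregated frame has at most 24 rows, however many raw readings there are, so it stays cheap to keep around even if only the plot is wanted.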

1 Answer


Here is one approach you can take:

library(tidyverse)
library(lubridate)

# Example data: timestamps stored as character strings, one depth reading each
dt <- data.frame(timestamp = c(
  '2022-03-04 01:00:49',
  '2022-03-04 01:03:00',
  '2022-03-04 02:00:01',
  '2022-03-04 02:00:00',
  '2022-03-04 02:00:49',
  '2022-03-04 02:00:00',
  '2022-03-04 03:02:00',
  '2022-03-04 03:33:00',
  '2022-03-04 03:45:00'),
  depth = c(1,2,3,4,5,6,7,8,9)
)

dt

## timestamp             depth
## 2022-03-04 01:00:49   1
## 2022-03-04 01:03:00   2
## 2022-03-04 02:00:01   3
## 2022-03-04 02:00:00   4
## 2022-03-04 02:00:49   5
## 2022-03-04 02:00:00   6
## 2022-03-04 03:02:00   7
## 2022-03-04 03:33:00   8
## 2022-03-04 03:45:00   9

# Parse the strings explicitly, take the hour of day,
# keep the maximum depth per hour, and plot one point per hour
dt %>% 
  mutate(hour = hour(ymd_hms(timestamp))) %>% 
  group_by(hour) %>% 
  summarise(max_depth = max(depth)) %>%
  ggplot(aes(y = max_depth, x = hour)) + geom_point()

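
The question also mentions fitting a cyclical GAM eventually. A minimal sketch of one way to do that uses the `mgcv` package with a cyclic cubic regression spline (`bs = "cc"`), so the fitted curve joins up between the end of one day and the start of the next. The data frame `hourly` and its values are simulated purely for illustration and are not the asker's data; the choices of `k = 10` and the response `max_depth` are assumptions as well.

```r
library(mgcv)  # recommended package, included with standard R installations

# Simulated hourly maxima over 30 days (illustrative only)
set.seed(1)
hourly <- data.frame(hour = rep(0:23, times = 30))
hourly$max_depth <- 10 + 3 * sin(2 * pi * hourly$hour / 24) +
  rnorm(nrow(hourly), sd = 0.5)

# bs = "cc" requests a cyclic cubic regression spline; the knots argument
# sets the period endpoints so hour 0 and hour 24 are treated as the same point
fit <- gam(max_depth ~ s(hour, bs = "cc", k = 10),
           data = hourly,
           knots = list(hour = c(0, 24)),
           method = "REML")

# Predictions at the two ends of the cycle should coincide
ends <- predict(fit, newdata = data.frame(hour = c(0, 24)))

# Fitted smooth with a shaded confidence band
plot(fit, shade = TRUE)
```

Setting the boundary knots to 0 and 24 matters here: without them the spline would wrap at the observed range of 0 to 23, which treats hour 23 and hour 0 as the same time of day.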

Robert Long
  • Does this answer your question? If so, please mark it as the accepted answer and also consider an upvote! If not, please let us know why, so that it can be improved :) – Robert Long Mar 21 '22 at 21:21