
I have a dataset with hourly measurements of water discharge (m3/s) covering a period of about 20 years. My date and time column is in POSIXct format (y-m-d h-m-s). I want to make a duration curve with water discharge on the y-axis and percentage of time on the x-axis. Since there is one discharge measurement per hour, my initial idea is to treat the total number of hours in the dataset as 100% of the time.

I have added a copy of my data frame to show how it looks now. Here, t is the original date and time column, and t4 is a column that counts the rows of t. My goal is for t4 to start at 1 for the first hour of the dataset and end at about 150,000 for the last hour. Right now it only counts the hours within each day: it stops at 00:00:00 and starts over the next day at 01:00:00. I need it to keep counting across days.

Does anyone know how to solve this problem?

structure(list(Date = structure(c(12143, 12143, 12143, 12143, 
12143, 12143, 12143, 12143, 12143, 12143, 12143, 12143, 12143, 
12143, 12143, 12143, 12143, 12143, 12143, 12143, 12143, 12143, 
12143, 12144, 12144, 12144, 12144, 12144, 12144, 12144, 12144, 
12144, 12144, 12144, 12144, 12144, 12144, 12144, 12144, 12144
), class = "Date"), t = structure(c(1049158800, 1049162400, 1049166000, 
1049169600, 1049173200, 1049176800, 1049180400, 1049184000, 1049187600, 
1049191200, 1049194800, 1049198400, 1049202000, 1049205600, 1049209200, 
1049212800, 1049216400, 1049220000, 1049223600, 1049227200, 1049230800, 
1049234400, 1049238000, 1049241600, 1049248800, 1049248800, 1049252400, 
1049256000, 1049259600, 1049263200, 1049266800, 1049270400, 1049274000, 
1049277600, 1049281200, 1049284800, 1049288400, 1049292000, 1049295600, 
1049299200), tzone = "UTC", class = c("POSIXct", "POSIXt")), 
    Vannføring = c(87.23, 87.23, 87.23, 87.23, 87.23, 87.23, 
    87.23, 87.23, 87.23, 89.38, 93.81, 93.81, 93.81, 96.08, 96.08, 
    96.08, 93.81, 91.57, 89.38, 89.38, 89.38, 89.38, 89.38, 48.86, 
    23.87, 22.28, 17.98, 43.53, 60.94, 87.23, 89.38, 89.38, 89.38, 
    89.38, 89.38, 89.38, 98.39, 98.39, 98.39, 98.39), t4 = c(1L, 
    2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 
    15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 0L, 2L, 2L, 
    3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 
    16L), percentage_of_time = c(0.36231884057971, 0.72463768115942, 
    1.08695652173913, 1.44927536231884, 1.81159420289855, 2.17391304347826, 
    2.53623188405797, 2.89855072463768, 3.26086956521739, 3.6231884057971, 
    3.98550724637681, 4.34782608695652, 4.71014492753623, 5.07246376811594, 
    5.43478260869565, 5.79710144927536, 6.15942028985507, 6.52173913043478, 
    6.88405797101449, 7.2463768115942, 7.60869565217391, 7.97101449275362, 
    8.33333333333333, 0, 0.72202166064982, 0.72202166064982, 
    1.08303249097473, 1.44404332129964, 1.80505415162455, 2.16606498194946, 
    2.52707581227437, 2.88808664259928, 3.24909747292419, 3.6101083032491, 
    3.97111913357401, 4.33212996389892, 4.69314079422383, 5.05415162454874, 
    5.41516245487365, 5.77617328519856)), row.names = c(NA, -40L
), groups = structure(list(Date = structure(c(12143, 12144), class = "Date"), 
    .rows = structure(list(1:23, 24:40), ptype = integer(0), class = c("vctrs_list_of", 
    "vctrs_vctr", "list"))), row.names = c(NA, -2L), class = c("tbl_df", 
"tbl", "data.frame"), .drop = TRUE), class = c("grouped_df", 
"tbl_df", "tbl", "data.frame"))
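
For reference, a duration curve is commonly built by ranking the discharge values rather than by counting hours in time order. The following is only a minimal sketch of that ranking approach in base R. It uses made-up hourly data, and the names `df`, `discharge`, `fdc` and `percent_time` are hypothetical (the real data frame stores discharge in `Vannføring`).

```r
# Hypothetical example data: 48 hourly discharge values (m3/s), not the real dataset
set.seed(1)
df <- data.frame(
  t = seq(as.POSIXct("2003-04-01 01:00:00", tz = "UTC"), by = "hour", length.out = 48),
  discharge = round(runif(48, min = 20, max = 100), 2)
)

# Sort discharge from highest to lowest
fdc <- data.frame(discharge = sort(df$discharge, decreasing = TRUE))

# Percentage of time each discharge value is equalled or exceeded:
# rank divided by the total number of hourly observations
fdc$percent_time <- 100 * seq_len(nrow(fdc)) / nrow(fdc)

# Duration curve: percentage of time on the x-axis, discharge on the y-axis
plot(fdc$percent_time, fdc$discharge, type = "l",
     xlab = "Percentage of time (%)", ylab = "Discharge (m3/s)")
```

With this approach the whole record counts as 100% of the time, so no per-day counter is needed.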
Saron B
  • 1
    Try to use function `hour` from library `data.table`. – Егор Шишунов Jan 19 '22 at 08:54
  • 1
    Welcome to Stack Overflow. Please don’t use images of data as they cannot be used without a lot of unnecessary effort. [For multiple reasons](//meta.stackoverflow.com/q/285551). You’re more likely to get a positive response if your question is reproducible. [See Stack Overflow question guidance](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – Peter Jan 19 '22 at 08:57
  • Thank you for that @Peter - I have now changed it and added a copy of my dataframe instead! – Saron B Jan 19 '22 at 09:14
  • @ЕгорШишунов I got this error when using `hour` from library `data.table`: Error in View : do not know how to convert 'x' to class “POSIXlt”. Do I have to convert my date and time column from POSIXct to POSIXlt? – Saron B Jan 19 '22 at 09:28
  • I get no errors when I use the data in your example. Can you say more about the error and/or change the example data? – Егор Шишунов Jan 19 '22 at 09:38
  • @ЕгорШишунов I am new to R, so maybe I am writing the code wrong. How did you apply the code to my dataset? :) I did this: `data.table(hour(t))` – Saron B Jan 19 '22 at 09:59
  • 1
    Write (on separate lines): `library(data.table)` `df = structure( *your example code*)` `df$t3 = hour(df$t)` – Егор Шишунов Jan 19 '22 at 11:10
  • @ЕгорШишунов Yep, your code worked! I managed to make a percent of time column, but the problem now is that it measures the percent of time per day. I added an edited copy of the data frame to the main question to show you what I mean. I need the new t4 column to not go by date, but rather to keep counting the hours further down instead of stopping at 00:00:00 and starting again at 1. – Saron B Jan 19 '22 at 12:47
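
One possible reading of the behaviour described in the last comment: the data frame in the question is a `grouped_df` grouped by `Date`, so any per-row calculation restarts for each day, and `hour()` itself only returns the hour of the day (0 to 23). The sketch below shows a continuous counter under that assumption, using dplyr on made-up data. The names `discharge`, `df_counted` and `hours_elapsed` are hypothetical.

```r
library(dplyr)

# Hypothetical example data: 48 hourly values, grouped by Date like the real data frame
df <- tibble(
  t = seq(as.POSIXct("2003-04-01 01:00:00", tz = "UTC"), by = "hour", length.out = 48),
  discharge = seq(50, 97, length.out = 48)
) %>%
  group_by(Date = as.Date(t))

df_counted <- df %>%
  ungroup() %>%                 # drop the per-day grouping so counting does not restart
  arrange(t) %>%                # make sure rows are in time order
  mutate(
    t4 = row_number(),          # counts rows: 1, 2, ..., n over the whole dataset
    # counts elapsed hours instead of rows; differs from t4 if hours are missing or duplicated
    hours_elapsed = as.numeric(difftime(t, min(t), units = "hours")) + 1,
    percentage_of_time = 100 * t4 / n()
  )
```

`row_number()` counts rows, while `hours_elapsed` counts clock hours since the first timestamp, so the two only agree when every hour appears exactly once.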

0 Answers