0

I have calculated hourly average using the following code in R. Hourly average is stored in an object 'avghr'. Because of the data preprocessing done to obtain 'merged' followed by 'avghr', I want to use 'avghr' to calculate daily average. I am facing error 'Error in group_by(): ℹ In argument: Time_sp = lubridate::floor_date(Time_sp, "1 day"). Caused by error in lubridate::floor_date(): ! object 'Time_sp' not found'. Could anyone please help me in resolving the error.

Code:

 merged <- rbindlist(dt.tidied, fill = TRUE, use.names = TRUE)
 
 #Required Columns
 cn <- c('Date/Time', 'C1 (C1)', 'C2 (C2)', 'C5 (C5)', 'C8 (C8)', 'C11 (C11)', 'C14 (C14)')                                      

 #Calc 15min average
 avghr <- merged %>%
 select(any_of(cn)) %>%
 as_tibble() %>%
 group_by(Time_sp = lubridate::floor_date(`Date/Time`, "60 mins")) %>%               #I want Date/Time column to be written as Time_sp in the output.
 #mutate(Time_sp = format(Date_Time, "%Y-%m-%d %H:%M:%S+00")) %>%                    # #I want Time_sp column to be written in the specified way. 
 summarise(across(where(is.numeric), ~ if(mean(is.na(.x)) > 0.5) NA else mean(.x, na.rm = 
                                                                               TRUE)))

write.csv(avghr, paste0(dirlist[idx],"_hr.csv"), row.names = FALSE)


 #Calc daily average
 avgdl <- avghr %>%
 select(any_of(cn)) %>%
 as_tibble() %>%
 group_by(Time_sp = lubridate::floor_date(`Time_sp`, "1 day")) %>%
 #mutate(Time_sp = format(Date_Time, "%Y-%m-%d %H:%M:%S+00")) %>%
 summarise(across(where(is.numeric), ~ if(sum(is.na(.x)) > 1) NA else mean(.x, na.rm = 
                                                                               TRUE)))

write.csv(avgdl, paste0(dirlist[idx],"_dly.csv"), row.names = FALSE)

}

I understand that the mutated Time_sp column is no longer in Date/Time format. But even if I remove the particular code line mutate(Time_sp...) while calculating avghr, it still gives me the same error.

I am able to calculate hourly or daily average individually using the same code. The problem is occurring while using hourly average object to calculate daily average. Can anyone please help.

Output of dput(head(avghr))

structure(list(Time_sp = structure(c(788918400, 788922000, 788925600, 
788929200, 788932800, 788936400), tzone = "UTC", class = c("POSIXct", 
"POSIXt")), `C1 (C1)` = c(84.6666666666667, 90.95, 94.7333333333333, 
95.95, 95.4666666666667, 90.3833333333333), `C2 (C2)` = c(NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), `C5 (C5)` = 
c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), `C8 
(C8)` = c(911.983333333333, 1062.41666666667, 1147.88333333333, 
1156.15, 1089.73333333333, 956.233333333333), `C11 (C11)` = 
c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), `C14 
(C14)` = c(1139.86233333333, 1304.17933333333, 1394.967, 
1406.02683333333, 1336.59633333333, 1191.40616666667)), row.names = 
c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"))
  • I'm afraid the included code (with `Time_sp`) does not match that error ( `Date/time` without backticks). Anyway, use the same `merged` as input: `avgdl <- merged %>% ... %>% group_by(Time_sp = lubridate::floor_date(`Date/Time`, "1 day")) %>% ...` – margusl May 29 '23 at 12:48
  • @margusl Thanks for the response. I rechecked the error. It is the same. Date/Time has backticks. I mentioned it in the question too, I am not sure why it is not appearing. Thanks for the suggestion but as per the protocol of my task, I need to calculate monthly average using daily average OBJECT. Can you please help in this. – Alexia k Boston May 29 '23 at 14:58
  • Even if it's not about backtick, your error is still complaining about missing **Date/Time** in `floor_date(Date/Time, "1 day")` while in your code there's no such expression, instead, it uses **Time_sp**. Hence the comment about the code and error not matching. If you decide to update the question, including some sample data (`dput(head(avghr)`) would increase the chance of getting any valuable feedback, without it it's just a guesswork. – margusl May 29 '23 at 16:15
  • @margusl I am sorry. You were correct. Error is about Time_sp and not Date/Time. Sorry again. I have updated the dput(head(avghr)) – Alexia k Boston May 29 '23 at 16:40
  • Also, presently I am running the code without using mutate code (considering that it will convert Date/Time column to character) . But I need to have Time in hh:mm:ss+00 format in the output. – Alexia k Boston May 29 '23 at 16:46

1 Answers1

1
#Calc 15min average
avghr <- merged %>%
  select(any_of(cn)) %>%
  as_tibble() %>%
  group_by(Time_sp = lubridate::floor_date(`Date/Time`, "60 mins")) %>% 
  summarise(across(where(is.numeric), ~ if(mean(is.na(.x)) > 0.5) NA else mean(.x, na.rm = TRUE)))

mutate(avghr, Time_sp = format(Time_sp, "%Y-%m-%d %H:%M:%S+00")) %>%
  write.csv(paste0(dirlist[idx],"_hr.csv"), row.names = FALSE)

#Calc daily average
avgdl <- avghr %>%
  group_by(Time_sp = lubridate::floor_date(`Time_sp`, "1 day")) %>%
  summarise(across(where(is.numeric), ~ if(sum(is.na(.x)) > 1) NA else mean(.x, na.rm = TRUE))) %>% 
  mutate(Time_sp = format(Time_sp, "%Y-%m-%d %H:%M:%S+00"))

write.csv(avgdl, paste0(dirlist[idx],"_dly.csv"), row.names = FALSE)
margusl
  • 7,804
  • 2
  • 16
  • 20
  • Thank you very much. It worked :)) But as I mentioned in my question that I need to have Time (for hourly as well as daily average) written in a specified manner (i.e. yyyy-mm-ss+00) (the reason why I used mutate), how can I achieve that? I am wondering if it is possible to use mutate function after summarise... and store the mutated hourly average (for say avghr1) in an object and write that in a csv output and use without mutated hourly average object (avghr) to calculate daily average. Or do you suggest any other way? – Alexia k Boston May 29 '23 at 19:23
  • It makes sense to format something to string at the very last step, but here the order doesn't really change anything, Formatting to string would have only failed because there was no column named `Date_Time` (in your commented-out lines you have `format(Date_Time, ...)` , `Date_Time` is the actual column name but you see from your own `dput()` output there is no such variable) – margusl May 29 '23 at 20:41
  • By the way, something weird is going on with your `dput()` output, -- there are linebreaks in bit unexpected places. If it's some kind of manual formatting, it is not necessary and rather complicates things. – margusl May 29 '23 at 20:52
  • Thank you. But what about the specified formatting of Time_sp column for hourly average? I need Time_sp column for hourly average also in the specified format (+00) but if I use mutate there, will Time_sp not be a unsupported format for grouping for 1day? I hope that I am able to make myself clear. – Alexia k Boston May 29 '23 at 21:13
  • In dput() output, I only changed the column names manually, to have it as a sample data. – Alexia k Boston May 29 '23 at 21:15
  • You mean the current answer does not formats dates as `1995-01-01 00:00:00+00` ? – margusl May 29 '23 at 21:59
  • It does format the dates but either for hourly average or daily average. And I want it to format the dates for both, hourly average as well as daily average. The issue is if I put mutate command in hourly average calculation to format the dates, then because the Dates haven been converted to string, the code stops in between without running further to calculate the daily average. You can see my code in question. Please let me know if I am not able to explain the issue completely. – Alexia k Boston May 29 '23 at 22:15
  • You mean like in the current answer? As there's no input data to test with, it might include few typos here and there. – margusl May 29 '23 at 22:44
  • I think you have missed to put %>% at the end of summarize.. in calculating 60min average. I added it in my code. And on running the entire code mentioned in your answer, it is giving me error - Error in UseMethod("group_by") : no applicable method for 'group_by' applied to an object of class "NULL" As use of mutate in 60min average calculation converts Time_sp (Date/Time) to string, Time_sp is no longer in a Date/Time format to be used for calculation of 1 day average. What do you think and suggest to resolve it? – Alexia k Boston May 30 '23 at 13:24
  • First mutate was missing the fisrt parameter. Connecting summarise and mutate with a `%>%` pipe would break everything after first `wirte.csv()`. – margusl May 30 '23 at 13:48
  • Okay. On copying your answer as it is, it gives me this error: Error in as.POSIXlt.character(x, tz, ...) : character string is not in a standard unambiguous format – Alexia k Boston May 30 '23 at 13:57
  • Hi, as your answer was able to solve the error I mentioned in the above question. I accepted and voted for your answer. However, the point I was discussing with you has not been resolved with your answer. I resolved it this way https://stackoverflow.com/questions/76367162/looking-for-an-easier-way-to-do-it-in-r and looking for a simpler way to do it, if there is any. If you can help, it would be great. – Alexia k Boston May 30 '23 at 17:20
  • I would give that exact answer to the linked question too. As said before, it's not tested and there could be some silly typo. But without a reproducible example that would generate that exact issue you are seeing, one can easily spend hours before getting the answer right. – margusl May 30 '23 at 18:30
  • 1
    Thank you. I have added the input and desired output in the linked question. – Alexia k Boston May 30 '23 at 20:29