
I am trying to plot water elevation data against river stage and precipitation data. My water elevation data is reported hourly, while I only have daily precipitation and river stage values. Because they are all in the same data frame, I placed each daily measurement at 12:00:00 so that it appears in the middle of the day.

My data frame looks like this:

   Date                River       Rain     Well 1
   1/1/2021 00:00      NA          NA       422
   1/1/2021 01:00      NA          NA       421.8
   1/1/2021 02:00      NA          NA       421.7
   1/1/2021 03:00      NA          NA       421
   1/1/2021 04:00      NA          NA       421.3
   1/1/2021 05:00      NA          NA       421
   1/1/2021 06:00      NA          NA       421
   1/1/2021 07:00      NA          NA       420.7
   1/1/2021 08:00      NA          NA       420.6
   1/1/2021 09:00      NA          NA       420.9
   1/1/2021 10:00      NA          NA       421.4
   1/1/2021 11:00      NA          NA       421.4
   1/1/2021 12:00      430         1.5      421

My issue arises from the format of the data in the Date column, which is imported as is from Excel in YYYY-MM-DD HH-MM-SS format. After initially loading the Excel file, sapply and lapply report it as numeric, in the same format as in Excel.

However, if I convert it using as.Date, it is returned as a numeric in the format YYYY-MM-DD, creating 24 identical YYYY-MM-DD values in my Date column for each day. I was using the following code to transform it:

   df <- df %>% transform(Date = as.Date(Date))

I have also tried to use:

  df <- df %>% ymd_hms(Date)

However, this gives an error and replaces all values in the Date column with NA.

When I plot the data after using as.Date, it only shows a single measurement for each day instead of the hourly data. However, when I don't transform the Date and leave it as is, I get the error:

   Error: Invalid input: date_trans works with objects of class Date only

All other data is in numeric format. Really appreciate any kind of help.

JackWassik
  • It would be easier to help you if you provide [a minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) including a snippet of your data or some fake data. As is your question provides no information about what could be the issue. – stefan Jul 13 '22 at 19:15
  • Sorry about that. i have added a sample of the data.frame – JackWassik Jul 13 '22 at 19:22

1 Answer


The issue is that converting with as.Date drops the hours. To keep them, use as.POSIXct. Also, your dates are not in YYYY-MM-DD format, so you have to specify the format explicitly. I'm not sure whether this alone will fix the issue with your plot, though.


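library(dplyr) # provides the %>% pipe

# The sample is ambiguous between day/month and month/day order;
# use "%m/%d/%Y %H:%M" instead if the month comes first in your data.
df %>% 
  transform(Date = as.POSIXct(Date, format = "%d/%m/%Y %H:%M"))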
#>                   Date River Rain Well.1
#> 1  2021-01-01 00:00:00    NA   NA  422.0
#> 2  2021-01-01 01:00:00    NA   NA  421.8
#> 3  2021-01-01 02:00:00    NA   NA  421.7
#> 4  2021-01-01 03:00:00    NA   NA  421.0
#> 5  2021-01-01 04:00:00    NA   NA  421.3
#> 6  2021-01-01 05:00:00    NA   NA  421.0
#> 7  2021-01-01 06:00:00    NA   NA  421.0
#> 8  2021-01-01 07:00:00    NA   NA  420.7
#> 9  2021-01-01 08:00:00    NA   NA  420.6
#> 10 2021-01-01 09:00:00    NA   NA  420.9
#> 11 2021-01-01 10:00:00    NA   NA  421.4
#> 12 2021-01-01 11:00:00    NA   NA  421.4
#> 13 2021-01-01 12:00:00   430  1.5  421.0
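
To make the difference between the two conversions concrete, here is a minimal base R sketch. The three timestamps are made-up values in the same layout as the sample data, and `tz = "UTC"` is an assumption added only so the result does not depend on the local time zone.

```r
# Hypothetical sample: three timestamps from the same day
stamps <- c("1/1/2021 00:00", "1/1/2021 01:00", "1/1/2021 12:00")

# as.Date() keeps only the calendar day, so all three collapse to one value
as_dates <- as.Date(stamps, format = "%d/%m/%Y")

# as.POSIXct() keeps the time of day (tz = "UTC" is an assumption)
as_times <- as.POSIXct(stamps, format = "%d/%m/%Y %H:%M", tz = "UTC")

length(unique(as_dates)) # one distinct day
length(unique(as_times)) # three distinct timestamps

# The class is what plotting functions look at; the storage mode is numeric
class(as_times)
mode(as_times)
```

This is also why a plot built on the as.Date result shows one position per day: every hourly row of a given day shares the same x value.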

DATA

df <- structure(list(Date = c(
  "1/1/2021 00:00", "1/1/2021 01:00", "1/1/2021 02:00",
  "1/1/2021 03:00", "1/1/2021 04:00", "1/1/2021 05:00", "1/1/2021 06:00",
  "1/1/2021 07:00", "1/1/2021 08:00", "1/1/2021 09:00", "1/1/2021 10:00",
  "1/1/2021 11:00", "1/1/2021 12:00"
), River = c(
  NA, NA, NA, NA,
  NA, NA, NA, NA, NA, NA, NA, NA, 430L
), Rain = c(
  NA, NA, NA, NA,
  NA, NA, NA, NA, NA, NA, NA, NA, 1.5
), `Well 1` = c(
  422, 421.8,
  421.7, 421, 421.3, 421, 421, 420.7, 420.6, 420.9, 421.4, 421.4,
  421
)), class = "data.frame", row.names = c(NA, -13L))
stefan
  • When I check the mode of the variable, it's still returned as numeric. And yes it still doesn't solve the plotting issue. When I remove the secondary axis from the script, it produces the plot but doesn't show river-level which is on the same axis as the well level. Do you think it would be better to use two different data frames and plot that way? – JackWassik Jul 13 '22 at 20:30
  • Under the hood, a `Date` or a datetime is stored as a numeric. Check `class(df$Date)` and you will see that it is of class `"POSIXct" "POSIXt"`. Without your plotting code it's hard to figure out what the issue is. But keep in mind that e.g. `scale_x_date` will not work with a `POSIXct`; for that you need `scale_x_datetime`. – stefan Jul 13 '22 at 21:17
  • Thank you! scale_x_datetime was successful! I was able to separate them into two separate dataframes and the plot is working! – JackWassik Jul 14 '22 at 10:55
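
Since the original plotting code was never posted, the following is only a sketch of how `scale_x_datetime` fits together with a `POSIXct` column in ggplot2. The data frame, the column names `Well` and `River`, the choice of geoms, and the label format are all assumptions made for illustration.

```r
library(ggplot2)

# Hypothetical data: hourly well readings plus one daily river value at noon
df <- data.frame(
  Date = as.POSIXct(
    c("2021-01-01 00:00", "2021-01-01 06:00", "2021-01-01 12:00"),
    format = "%Y-%m-%d %H:%M", tz = "UTC"
  ),
  Well = c(422, 421, 421),
  River = c(NA, NA, 430)
)

p <- ggplot(df, aes(x = Date)) +
  geom_line(aes(y = Well)) +
  # Draw the daily value as a point, skipping the rows where it is NA
  geom_point(data = subset(df, !is.na(River)), aes(y = River)) +
  # A POSIXct x axis needs scale_x_datetime(), not scale_x_date()
  scale_x_datetime(date_labels = "%d %b %H:%M")
```

Printing `p` draws the plot. Dropping the `NA` rows before plotting the daily series has a similar effect to keeping it in a separate data frame, which is the approach that worked in the last comment.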