1

folks...

I am having trouble with date/time showing up properly in lubridate.

Here's my code:

Temp.dat <- read_excel("Temperature Data.xlsx", sheet = "Sheet1", na="NA") %>%
  mutate(Treatment = as.factor(Treatment),
         TempC=as.factor(TempC),
         TempF=as.factor(TempF),
         Month=as.factor(Month),
         Day=as.factor(Day),
         Year=as.factor(Year),
         Time=as.factor(Time))%>%
 select(TempC, Treatment, Month, Day, Year, Time)%>%
 mutate(Measurement=make_datetime(Month, Day, Year, Time))

Here's what it spits out:

tibble [44 x 7] (S3: tbl_df/tbl/data.frame)
 $ TempC      : Factor w/ 38 levels "15.5555555555556",..: 31 32 29 20 17 28 27 26 23 24 ...
 $ Treatment  : Factor w/ 2 levels "Grass","Soil": 1 1 1 1 2 2 2 2 2 2 ...
 $ Month      : Factor w/ 1 level "6": 1 1 1 1 1 1 1 1 1 1 ...
 $ Day        : Factor w/ 2 levels "15","16": 1 1 1 1 1 1 1 1 1 1 ...
 $ Year       : Factor w/ 1 level "2022": 1 1 1 1 1 1 1 1 1 1 ...
 $ Time       : Factor w/ 3 levels "700","1200","1600": 3 3 3 3 3 3 3 3 3 3 ...
 **$ Measurement: POSIXct[1:44], format: "0001-01-01 03:00:00" "0001-01-01 03:00:00" "0001-01-01 03:00:00" "0001-01-01 03:00:00" ...**

I've put asterisks by the problem result. It should spit out June 16th at 0700 or something like that, but instead it's defaulting to January 01, 1AD for some reason. I've tried adding colons to the date in excel, but that defaults to a 12-hour timecycle and I'd like to keep this at 24 hours.

What's going on here?

  • Factoring seconds and temperature does not seem logical to me. Factoring a continuous variable will basically make a level for each entry, which as you see in TempC, you have 38 levels. Try not factoring temps and time. Import time as date, specifying the format you have in the excel file. – guasi Jun 17 '22 at 14:55
  • Thanks for your answer! Would you mind expanding on this a little bit? I'm not sure how my temperature data would affect my date/time data. I'm also not using seconds in my measurements, just hours. – Joe Ballenger Jun 17 '22 at 15:03
  • Sure, temp does not affect the problem of time. It was just an observation. `make_datetime()` is expecting `make_datetime(year = 1999, month = 12, day = 22, sec = 10)`. Grab the time portion of the `Time` that contains the time and split it into hours, minutes, and seconds. I don't know how your time data is coming in to help you split it, if you post a row of the data before you transform it (as it comes when you read it with `read_excel()`) I can write a proper answer. – guasi Jun 17 '22 at 16:21
  • I know you're not using seconds... – guasi Jun 17 '22 at 16:27
  • Here's how it's coming in, using the code before the pipes. I'm not sure how to post a comment so it codes properly unfortunately. `tibble [44 x 10] (S3: tbl_df/tbl/data.frame) $ Stake : chr [1:44] "4r" "10r" "16r" "22r" ... $ Treatment: Factor w/ 2 levels "Grass","Soil": 1 1 1 1 2 2 2 2 2 2 ... $ Month : int [1:44] 6 6 6 6 6 6 6 6 6 6 ... $ Day : int [1:44] 15 15 15 15 15 15 15 15 15 15 ...` – Joe Ballenger Jun 17 '22 at 18:37
  • The info for `Time` got cut off, can you edit your post and include the contents of your data? You can store the output of the file `tmp <- read_excel( .. )` and then post the first few rows `head(tmp)`. You can also post `str(tmp$Time)` to see the type. – guasi Jun 17 '22 at 19:21
  • Here is str(tmp$Time) `int [1:44] 1600 1600 1600 1600 1600 1600 1600 1600 1600 1600 ...` – Joe Ballenger Jun 17 '22 at 19:47
  • Here are the first few rows of Head(tmp) `Year Month Day Time TempC Treatment Measurement 2022 6 15 1600 31.5 Grass 2022-08-20 16:00:00 2022 6 15 1600 31.7 Grass 2022-08-20 16:00:00` – Joe Ballenger Jun 17 '22 at 19:53
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/245706/discussion-between-guasi-and-joe-ballenger). – guasi Jun 17 '22 at 21:40

1 Answers1

1

This will work as long as the format in the excel file for date is set to time, and it imports as a date-time object that lubridate can interpret.

library(dplyr)
library(lubridate)
Temp.dat <- read_excel("t.xlsx", sheet = "Sheet1", na="NA") %>%
  mutate(Treatment = as.factor(Treatment),
         TempC = as.numeric(TempC),
         TempF = as.numeric(TempF),
         Month = as.numeric(Month),
         Day = as.numeric(Day),
         Year = as.numeric(Year),
         Hour = hour(Time),
         Minute = minute(Time)) %>%
  select(TempC, Treatment, Month, Day, Year, Hour, Minute) %>%
  mutate(Measurement = make_datetime(year = Year, 
                                     month = Month, 
                                     day = Day, 
                                     hour = Hour,
                                     min = Minute))

Notice the value for the arguments for make_datetime() are set to numeric, which is what the function expects. If you pass factors, the function gives you the weird dates you were seeing.

No need to convert Time to string and extract hours and minutes, as I suggested in the comments, since you can use lubridate's minute() and hour() functions.

EDIT In order to be able to use lubridate's functions Time needs to be a date-time object. You can check that it is by looking at what read_excel() produces

> str(read_excel("t.xlsx", sheet = "Sheet1", na="NA"))
tibble [2 × 7] (S3: tbl_df/tbl/data.frame)                                                     
 $ Treatment: chr [1:2] "s" "c"
 $ TempC    : num [1:2] 34 23
 $ TempF    : num [1:2] 99 60
 $ Month    : num [1:2] 5 4
 $ Day      : num [1:2] 1 15
 $ Year     : num [1:2] 2020 2021
 $ Time     : POSIXct[1:2], format: "1899-12-31 04:33:23" "1899-12-31 03:20:23"

See that Time is type POSIXct, a date-time object. If it is not, then you need to convert it into one if you want to use lubridate's minute() and hour() functions. If it cannot be converted, there are other solutions, but they depend on what you have.

guasi
  • 1,461
  • 3
  • 12
  • Thanks so much for this! I'm getting the following error code: `Error: Problem with `mutate()` column `Hour`. i `Hour = hour(Time)`. x 'origin' must be supplied` What is this origin referring to? – Joe Ballenger Jun 17 '22 at 21:34
  • It's possible `Time` is not formatted as a date-time object, but it's hard to tell without seeing the structure and contents of the data produced by `read_excel()` **before** it gets passed on to `mutate()`. You've tried to post it in the comments, but it gets cut off and it's really hard to read. You can **edit** your post instead. This [post](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) explains how to grab and post data. – guasi Jun 17 '22 at 21:50
  • I went back the comments, and I see that `Time` is coming in as an `int [1:44] 1600`. What's this 1600? minutes? – guasi Jun 17 '22 at 22:24
  • I eventually got this to work. The issue was that I had a column in my data for time, when I should have split it into hours and minutes. `$ TempC : num [1:88] 31.5 31.7 31.2 27.2 26.4 30.7 30.3 29.3 28.4 28.6 ... $ Treatment : Factor w/ 2 levels "Grass","Soil": 1 1 1 1 2 2 2 2 2 2 ... $ Month : int [1:88] 6 6 6 6 6 6 6 6 6 6 ... $ Day : int [1:88] 15 15 15 15 15 15 15 15 15 15 ... $ Year : int [1:88] 2022 2022 2022 2022 2022 2022 2022 2022 2022 2022 ... $ Hour : int [1:88] 16 16 16 16 16 16 16 16 16 16` – Joe Ballenger Jun 21 '22 at 01:00