
I am new to R. I have a data frame with two columns that hold a date and time but are stored as character. I want to use difftime() to create a column that stores the time between the two columns in seconds. I tried several date-time functions to convert the start_time column from character to a date-time class so that I could use difftime() on it, but none of them worked. Please help.

Below is an image of the table, followed by the code I tried and the error or result each attempt returned. start_time and end_time are the date columns.

[image of the table]

1

all_trips3 <- as.POSIXlt(all_trips3$start_time)

I got this error after running it: "Error in as.POSIXlt.character(all_trips3$start_time) : character string is not in a standard unambiguous format"

2

all_trips3$start_time <- strptime(all_trips3$start_time, "%m/%d/%y")

It ran, but it turned the start_time column into NA values.

3

all_trips3$start_time <- as.POSIXct(all_trips3$start_time, "%m/%d/%y %H:%M:%S", tz = Sys.timezone())

This ran too, but it also turned the start_time column into NA values.

4. I found this on Stack Overflow while looking for solutions:

all_trips3 <- all_trips3 %>%
  mutate(across(c(start_time, end_time), as.POSIXct,
                format = "%m/%d/%y %H:%M"))

This ran too, but it filled both date columns with NA values.
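
One way to check attempts like these before they overwrite anything is to parse into a separate vector and count the NA values first. This is only a minimal sketch: the small `all_trips3` below is a hypothetical stand-in for the real table, the month/day/four-digit-year hour:minute layout is an assumption taken from the example data in the answer, and `tz = "UTC"` is an assumption made so the result does not depend on the local time zone.

```r
# Hypothetical stand-in for the real all_trips3 (assumed string layout).
all_trips3 <- data.frame(
  start_time = c("2/19/2022 5:36", "2/19/2022 7:02"),
  end_time   = c("2/19/2022 6:10", "2/19/2022 7:45"),
  stringsAsFactors = FALSE
)

# Without a format, as.POSIXlt() only tries ISO-like layouts such as
# "2022-02-19 05:36:00", so strings like these raise the error from attempt 1.
no_format <- tryCatch(
  as.POSIXlt(all_trips3$start_time),
  error = function(e) conditionMessage(e)
)

# Parse into a separate vector so the original column stays untouched.
parsed_start <- as.POSIXct(
  all_trips3$start_time,
  format = "%m/%d/%Y %H:%M",  # %Y = four-digit year, %y = two-digit year
  tz = "UTC"                  # assumption: fixed zone for reproducibility
)

# Count the values that failed to parse.
n_failed <- sum(is.na(parsed_start))

# Assign to the column (not to the whole data frame) only if nothing failed.
if (n_failed == 0) {
  all_trips3$start_time <- parsed_start
}
```

If `n_failed` is not zero, the character column is still intact, so a different format string can be tried without reloading the data.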

Phil

1 Answer

You were close with strptime!

Consider this small example data frame.

df
#   X1 X2 X3     start_time        stop_time X5
# 1  0  0  0 2/19/2022 5:36 11/19/2022 15:36  0
# 2  0  0  0 2/19/2022 5:36 11/19/2022 15:36  0
# 3  0  0  0 2/19/2022 5:36 11/19/2022 15:36  0

Then identify your time columns,

tc <- grep('start_time|stop_time', names(df))

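and convert them with strptime, using a four-digit year (%Y) and including the time in the format string: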

df[tc] <- lapply(df[tc], strptime, '%m/%d/%Y %H:%M')

df
#   X1 X2 X3          start_time           stop_time X5
# 1  0  0  0 2022-02-19 05:36:00 2022-11-19 15:36:00  0
# 2  0  0  0 2022-02-19 05:36:00 2022-11-19 15:36:00  0
# 3  0  0  0 2022-02-19 05:36:00 2022-11-19 15:36:00  0
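
With both columns converted, the duration in seconds that the question asks for can be computed with difftime(). The following is a minimal sketch rather than part of the original answer. It swaps strptime for as.POSIXct, which stores each time as a single number and tends to be easier to keep in a data frame, and it assumes `tz = "UTC"` so that daylight saving changes do not affect the result. The column name `duration_secs` is made up for illustration.

```r
# Example strings in the same layout as the data frame above.
df <- data.frame(
  start_time = rep("2/19/2022 5:36", 3),
  stop_time  = rep("11/19/2022 15:36", 3),
  stringsAsFactors = FALSE
)

tc <- c("start_time", "stop_time")

# Convert both columns; tz = "UTC" is an assumption, use your own zone if needed.
df[tc] <- lapply(df[tc], as.POSIXct, format = "%m/%d/%Y %H:%M", tz = "UTC")

# difftime() returns a "difftime" object; as.numeric() keeps just the seconds.
df$duration_secs <- as.numeric(
  difftime(df$stop_time, df$start_time, units = "secs")
)
```

Setting `units = "secs"` explicitly matters because difftime() otherwise picks a unit automatically based on the size of the differences.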

Data:

df <- structure(list(X1 = c(0, 0, 0), X2 = c(0, 0, 0), X3 = c(0, 0, 
0), start_time = c("2/19/2022 5:36", "2/19/2022 5:36", "2/19/2022 5:36"
), stop_time = c("11/19/2022 15:36", "11/19/2022 15:36", "11/19/2022 15:36"
), X5 = c(0, 0, 0)), class = "data.frame", row.names = c(NA, 
-3L))
jay.sf
  • It ran without any error, but the columns are filled with NA values, the same as with the other functions I have tried. – Zainab Zanna Feb 26 '23 at 13:32
  • @ZainabZanna Then make it reproducible, good luck! – jay.sf Feb 26 '23 at 13:34
  • I tried again after dropping the data frame I was using. I usually save a table to a new data frame whenever I make significant changes to it, so I can easily go back to it when necessary. I then ran the two lines of code you provided, and it worked!! Thank you very much. @jay.sf – Zainab Zanna Feb 26 '23 at 14:51
  • I would really appreciate it if you could explain how exactly the code works, and what each line or function does. Thank you. – Zainab Zanna Feb 26 '23 at 14:57
  • @ZainabZanna Glad it worked for you. `grep` is actually just another way to find the time columns; you could also do `tc <- c("start_time", "stop_time")`. You can easily look up how a function works in its documentation by typing `?lapply` or `?strptime`. I recommend doing that with every new function you use if you want to learn R thoroughly. Also play through the examples given there and try to understand them, and you'll quickly become a great R programmer. :) – jay.sf Feb 26 '23 at 15:28
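
To make the explanation in the last comment concrete, here is a small sketch of what each piece of the two lines from the answer does. The one-row data frame is a made-up example with the same column names as in the answer, and the helper names `tc_by_name` and `converted` are illustrative only.

```r
# Made-up one-row example with the column names used in the answer.
df <- data.frame(
  X1 = 0,
  start_time = "2/19/2022 5:36",
  stop_time  = "11/19/2022 15:36",
  X5 = 0,
  stringsAsFactors = FALSE
)

# grep() returns the positions of the column names that match the pattern.
tc <- grep("start_time|stop_time", names(df))

# Selecting by name picks the same columns, as the comment points out.
tc_by_name <- c("start_time", "stop_time")
same_columns <- identical(names(df)[tc], tc_by_name)

# lapply() calls strptime() once per selected column and passes the format
# string along as an extra argument; the result is a list of converted columns.
converted <- lapply(df[tc], strptime, "%m/%d/%Y %H:%M")

# Writing the list back replaces the character columns in place.
df[tc] <- converted
```

Because `tc` holds positions, it silently picks different columns if the column order changes, which is one reason selecting by name can be the safer choice.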