Since there are two spaces in the time stamp, you may want to split by "\\s(?=\\d)"
, space followed by a digit, in order to get two pieces instead of three.
df |>
tidyr::separate(ActivityHour, c("ActivityDay"," ActivityHour_sep"), sep="\\s(?=\\d)")
# ActivityDay ActivityHour_sep X
# 1 04/10/2023 12:32:25 AM 0.9148060
# 2 04/10/2023 12:32:26 AM 0.9370754
# 3 04/10/2023 12:32:27 AM 0.2861395
# 4 04/10/2023 12:32:28 AM 0.8304476
# 5 04/10/2023 12:32:29 AM 0.6417455
# 6 04/10/2023 12:32:30 AM 0.5190959
# 7 04/10/2023 12:32:31 AM 0.7365883
# 8 04/10/2023 12:32:32 AM 0.1346666
# 9 04/10/2023 12:32:33 AM 0.6569923
# 10 04/10/2023 12:32:34 AM 0.7050648
# 11 04/10/2023 12:32:35 AM 0.4577418
Base R solution (about 20% faster):
do.call(rbind.data.frame, strsplit(df$ActivityHour, "\\s(?=\\d)", perl=TRUE)) |>
setNames(c("ActivityDay"," ActivityHour_sep")) |>
cbind(df[setdiff(names(df), 'ActivityHour')])
# ActivityDay ActivityHour_sep X
# 1 04/10/2023 12:32:25 AM 0.9148060
# 2 04/10/2023 12:32:26 AM 0.9370754
# 3 04/10/2023 12:32:27 AM 0.2861395
# 4 04/10/2023 12:32:28 AM 0.8304476
# 5 04/10/2023 12:32:29 AM 0.6417455
# 6 04/10/2023 12:32:30 AM 0.5190959
# 7 04/10/2023 12:32:31 AM 0.7365883
# 8 04/10/2023 12:32:32 AM 0.1346666
# 9 04/10/2023 12:32:33 AM 0.6569923
# 10 04/10/2023 12:32:34 AM 0.7050648
# 11 04/10/2023 12:32:35 AM 0.4577418
However, it might be better to use the POSIX
format instead of time in character format so that you can perform time calculations. You can use strptime
for that.
df <- transform(df, ActivityHour=strptime(ActivityHour, '%m/%d/%Y %H:%M:%S %p', tz='GMT'))
df
# ActivityHour X
# 1 2023-04-10 00:32:25 0.9148060
# 2 2023-04-10 00:32:26 0.9370754
# 3 2023-04-10 00:32:27 0.2861395
# 4 2023-04-10 00:32:28 0.8304476
# 5 2023-04-10 00:32:29 0.6417455
# 6 2023-04-10 00:32:30 0.5190959
# 7 2023-04-10 00:32:31 0.7365883
# 8 2023-04-10 00:32:32 0.1346666
# 9 2023-04-10 00:32:33 0.6569923
# 10 2023-04-10 00:32:34 0.7050648
# 11 2023-04-10 00:32:35 0.4577418
where
class(df$ActivityHour)
# [1] "POSIXlt" "POSIXt"
Data:
df <- structure(list(ActivityHour = c("04/10/2023 12:32:25 AM", "04/10/2023 12:32:26 AM",
"04/10/2023 12:32:27 AM", "04/10/2023 12:32:28 AM", "04/10/2023 12:32:29 AM",
"04/10/2023 12:32:30 AM", "04/10/2023 12:32:31 AM", "04/10/2023 12:32:32 AM",
"04/10/2023 12:32:33 AM", "04/10/2023 12:32:34 AM", "04/10/2023 12:32:35 AM"
), X = c(0.914806043496355, 0.937075413297862, 0.286139534786344,
0.830447626067325, 0.641745518893003, 0.519095949130133, 0.736588314641267,
0.13466659723781, 0.656992290401831, 0.705064784036949, 0.45774177624844
)), row.names = c(NA, -11L), class = "data.frame")