Unable to convert character to Datetime

Question

This isn't duplicate. I already referred these posts 1,2,3

I have a dataframe like below:

t_df <- data.frame("SN" = c("AabcC123","ABC123","ABC123","MNO098","MNO098","MNO098"), 
                      "code" = c("ABC1111","DEF222","GHI133","","MNO1123","MNO567"), 
                      "d_time" = c("21/12/2012 12:12","30/06/2019 11:11","20/02/2027 10:10","22/11/2327 09:09:11","5/2/1927 08:08:12",""))
t_df <- data.frame(lapply(t_df, as.character), stringsAsFactors=FALSE)

While I read the csv file, all my dates are of character type.

So when I try to sort the dataframe, it only works based on first characters.

I would like to convert this to Datetime format.

Though I tried the below options based on other posts that I linked above, nothing helped. It returns NA

as_datetime(t_df$d_time, "%d/%m/%Y %H:%M:%S")  

as.POSIXct(t_df$d_time,format="%d/%m/%Y %H:%M:%S")

dmy_hms(t_df$d_time)

Can you help me convert the datatype to datetime?

Why are you also defining time? You only have dates there. Try `as.POSIXct("21/12/2012", format = '%d/%m/%Y')` — Sotos, Nov 01 '19 at 09:03
@Sotos - your code returns `NA` for my data or even for your example `21/12/2012` when I use `as.POSIXct("21/12/2012 12:12", format = '%d/%m/%Y %H:%M:%S') — The Great, Nov 01 '19 at 09:46
It gives `NA` if my data only contains `H` and `M`. But my data can contain both mix `H:M` or `H:M:S` as shown in sample — The Great, Nov 01 '19 at 09:51
It gives NA because you are also defining seconds but you only have hours:minutes (12:12)... — Sotos, Nov 01 '19 at 09:54
Yes, but how do I make sure it returns whatever time component it has. — The Great, Nov 01 '19 at 09:54

score 2 · Accepted Answer · answered Nov 01 '19 at 10:04

2

An idea is to count the colons (:) and if there is only 1, then add seconds as :00, i.e

library(stringr) #for str_count()

t_df$d_time[t_df$d_time != ''] <- ifelse(str_count(t_df$d_time[t_df$d_time != ''], ':') < 2, 
                                               paste0(t_df$d_time[t_df$d_time != ''], ':00'), 
                                                               t_df$d_time[t_df$d_time != ''])

as.POSIXct(t_df$d_time, format = "%d/%m/%Y %H:%M:%S")
#[1] "2012-12-21 12:12:00 EET" "2019-06-30 11:11:00 +03" "2027-02-20 10:10:00 +03" "2327-11-22 09:09:11 +03" "1927-02-05 08:08:12 EET" NA

answered Nov 01 '19 at 10:04

Sotos

51,121
6
32
66

1

Thank you. UPVOTED for the effort and time. marked as answer – The Great Nov 01 '19 at 10:05
If I just stick to `H:M` irrespective of what is present in my data, I will just get zeroes for missing component. Is that how this works? – The Great Nov 01 '19 at 10:07
If you define the format with just hours:minutes, then If the date has seconds you will get NA. You must have 1 uniform format to be able to convert. Actually, I think there are packages with time functions that accept multiple formats – Sotos Nov 01 '19 at 10:10
One quick question, when I use like this, `as.POSIXct("2220-04-27 00:10",format="%Y-%m-%d %H:%M") and `as.Date("2220-04-27 00:10",format = "%Y-%m-%d")`, it works fine here. But in my real dataframe with million records, all `dates` have a difference of `1` day. For example, `date` value is 2220-04-26` and datetime is `2220-04-27 00:10`. This is happening for whole dataframe. can you let me know what can be the issue? – The Great Nov 01 '19 at 10:44
Does not sound right. I am not sure why but there are `time zone` and `origin` arguments in `as.POSIXct()` which might help you tackle that problem – Sotos Nov 01 '19 at 10:48

score 1 · Answer 2 · answered Nov 01 '19 at 09:04

You have to adjust the format to the format in your data which is %d/%m/%Y. For example in the first row you have 21/12/2012 which is day/month/year. In R the short for day is %d the short for month is %m and the short for a four digit year is %Y. The slashes / in between refer to the slashes in your string.

Therefore

t_df <- data.frame("SN" = c("AabcC123","ABC123","ABC123","MNO098","MNO098","MNO098"), 
                   "code" = c("ABC1111","DEF222","GHI133","","MNO1123","MNO567"), 
                   "d_time" = c("21/12/2012","30/06/2019","20/02/2027","22/11/2327","5/2/1927",""),
                   stringsAsFactors = FALSE)

t_df$d_time <- as.POSIXct(t_df$d_time, format = "%d/%m/%Y")

Note that most of the time when creating a data.frame it is better to use the option stringsAsFactors = FALSE as I did.

Unable to convert character to Datetime

2 Answers2