-1

I have about 1000 lines of data recorded in this format: 2 Apr 2018, 23:14:17

And I have about another 1000 recorded like this: Mar 27, 2018, 10:42:26 PM

My first issue is that they are recognised by Excel as a character string (the original compiler of this document dumped a bunch of DateTimes in from somewhere else).

My second issue is that even after we successfully convert the format to be recognised as 'DateTime', I then need to change the format to DD:MM:YYYY hh:mm:ss in 24-hour rather than 12-hour time.

Your help with this or advice as to how I can rephrase the question is greatly appreciated. Tagging Excel, R and Python as these are the tools I have on-hand.

Davide Lorino
  • 875
  • 1
  • 9
  • 27
  • A lazy suggestion - you could try the [`dateparser`](https://pypi.python.org/pypi/dateparser) module for a quick fix. It will be slower than a tailored solution but it won't make a huge difference with your dataset size. Not at a PC to make a specific answer. – roganjosh Apr 10 '18 at 16:47

3 Answers3

1

In R, the package lubridate is what you need. Here is some example codes:

> require(lubridate)
# examples from the question statement.
> test_date_1 = "2 Apr 2018, 23:14:17"
> test_date_2 = "Mar 27, 2018, 10:42:26 PM"

# parse the date.
> date1 = dmy_hms(test_date_1)
> date2 = mdy_hms(test_date_2)

# show the date
> print(date1)
[1] "2018-04-02 23:14:17 UTC"
> print(date2)
[1] "2018-03-27 22:42:26 UTC"
Wenlong Liu
  • 444
  • 2
  • 13
0

In case you prefer plain R use strptime:

strptime("2 Apr 2018, 23:14:17", "%d %b %Y, %H:%M:%S")
[1] "2018-04-02 23:14:17 PDT"
strptime("Mar 27, 2018, 10:42:26 PM", "%b %d, %Y, %H:%M:%S")
[1] "2018-03-27 10:42:26 PDT"
Ott Toomet
  • 1,894
  • 15
  • 25
  • 1
    I really appreciate this thank you! I can only select one, and I cant 'upvote' with my level of experience, but I just wanted to say thank you for showing me another way of doing this. – Davide Lorino Apr 10 '18 at 17:03
0

The following example will give you a fair idea on how to fix your dates in R (with lubridate), if you have your 1000 records in a data frame. The time will also be converted to a 24 hr format.

Test data frame for demonstration:

df <- data.frame(date = rep("2 Apr 2018, 23:14:17", 10))
temp <- data.frame(date = rep("Mar 27, 2018, 10:42:26 PM", 10))
df <- rbind(df, temp)
df

Fix the dates:

library(lubridate)

dmy <- dmy_hms(df$date)
mdy <- mdy_hms(df$date) 

mdy[is.na(mdy)] <- dmy[is.na(mdy)]

df$date <- mdy

Refer the solution here for further information.

Deepak Rajendran
  • 358
  • 1
  • 11