0

I am reading an Excel file using the function readxl::read_excel(), but it appears that date are not getting read properly.

In the original file, one such date is 2020-JUL-13, but it is getting read as 44025.

Is there any way to get back the original date variable as in the original file?

Any pointer is very appreciated.

Thanks,

Brian Smith
  • 1,200
  • 4
  • 16

2 Answers2

3

Basically, you could try to use:

as.Date(44025)

However, you will notice error saying Error in as.Date.numeric(44025) : 'origin' must be supplied. And that means that all you need is to know origin, i.e. starting date from which to start counting. When you check, mentioned by Bappa Das, help page for convertToDate function, you will see that it is just a wrapper for as.Date() function and that the default argument for origin parameter is "1900-01-01".

Next, you can check, why is this, by looking for date systems in Excel and here is a page for this:

Date systems in Excel

Where is an information that for Windows (for Mac there are some exceptions) starting date is indeed "1900-01-01".

And now, finally, if you want to use base R, you can do:

as.Date(44025, origin = "1900-01-01")

This is vectorized function, so you can pass whole column as well.

gss
  • 1,334
  • 6
  • 11
1

You can use openxlsx package to convert number to date like

library(openxlsx)

convertToDate("44025")

Or to convert the whole column you can use

convertToDate(df$date)
UseR10085
  • 7,120
  • 3
  • 24
  • 54