
I've tried everything in the thread "as.Date returning NA while converting from 'ddmmmyyyy'" to try and sort my problem.

I'm using these commands to turn a factor into a date:

cohort$doi <- as.Date(cohort$doi, format= "%Y/%m/%d")

All my dates are currently in the format YYYY-MM-DD, so as far as I'm aware the above should work.

I used this code yesterday to convert all my dates for various variables from a factor to a date. It worked yesterday and everything was fine. Today I opened my script and imported in my data, ran this command and viewed my data but all of the dates now say NA.

I've tried everything from previous threads (I looked at a few more than just the one I linked above) but nothing has worked so far. I'm not sure what to do now.

Example of what doi column looks like:

1970-01-01

1970-02-02

1970-03-03

1970-04-04

The column is currently classed as a factor. When I run the code above, the column becomes a date but all the dates now say NA. Other than closing R and opening it up again today, I've done nothing else.

Eams
  • Please post a sample of your data `cohort$doi` and corresponding code that replicates your problem. It's unlikely that something is different about your code from yesterday to today, but rather you cleared your environment by restarting your program. Read this SO question on [How to make a great R reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – LMc Mar 12 '21 at 15:58
  • @LMc I've just added a sample of what the column would look like and the code is already mentioned above. I'm guessing it must be something I cleared but I can't work out what it could be as I've read in my code the same way – Eams Mar 12 '21 at 16:04
  • If what you've posted is what your data look like then `class(cohort$doi)` couldn't be numeric. It would either be date or character. – LMc Mar 12 '21 at 16:08
  • You have also specified the `format` argument incorrectly, which is why you're getting `NA`. – LMc Mar 12 '21 at 16:08
  • My data does look like that (just different dates) and R says that it is a factor. How should the format be then? – Eams Mar 12 '21 at 16:12

1 Answer


If you read the documentation for as.Date, you will note the default format is %Y-%m-%d or %Y/%m/%d:

The default formats follow the rules of the ISO 8601 international standard which expresses a day as "2001-02-03".

In your code you have specified your dates are formatted by slashes, but your sample data shows they are formatted in the default format used by as.Date:

doi <- as.factor(c("1970-01-01",
                   "1970-02-02",
                   "1970-03-03",
                   "1970-04-04"))

as.Date(doi) # default format %Y-%m-%d
[1] "1970-01-01" "1970-02-02" "1970-03-03" "1970-04-04"

as.Date(doi, format = "%Y/%m/%d") # incorrect specification of your date format
[1] NA NA NA NA

as.Date("1970/01/01") # also a default format
[1] "1970-01-01"

Note: as.Date accepts character strings, factors, logical NA and objects of classes "POSIXlt" and "POSIXct".
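Applied to a data frame column, the same fix might look like the sketch below. The `cohort` data frame here is a hypothetical stand-in built from the sample values in the question, and the `converted` name and the NA check are illustrative additions, not part of the original code.

```r
# Hypothetical stand-in for the asker's data: a factor column of ISO-style dates
cohort <- data.frame(
  doi = factor(c("1970-01-01", "1970-02-02", "1970-03-03", "1970-04-04"))
)

# Inspect the column before converting
class(cohort$doi)          # "factor"
head(levels(cohort$doi))   # the underlying strings, with hyphen separators

# Convert via character, with a format whose separators match the data
converted <- as.Date(as.character(cohort$doi), format = "%Y-%m-%d")

# Guard: stop if any non-missing input turned into NA during conversion
stopifnot(!any(is.na(converted) & !is.na(cohort$doi)))

cohort$doi <- converted
class(cohort$doi)          # "Date"
```

Checking for newly introduced NAs right after the conversion makes a separator mismatch show up as an error instead of a silently empty column.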

LMc
  • So if a date is already in the correct default format, I can ignore the format= part? – Eams Mar 12 '21 at 16:22
  • Yes, that's correct. It has to be that format; otherwise you have to specify it correctly. Alternatively, there are other packages like `lubridate` that have the function [guess_formats](https://lubridate.tidyverse.org/reference/guess_formats.html). – LMc Mar 12 '21 at 17:53
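As a base-R alternative to the `lubridate` suggestion (a swapped-in technique, not what the comment proposes), `as.Date` also takes a `tryFormats` argument in R 3.5.0 and later, which is consulted when `format` is not given. A minimal sketch, where `parse_dates` is a hypothetical helper name:

```r
# Hypothetical helper: try hyphen- and slash-separated year-month-day formats
parse_dates <- function(x) {
  as.Date(as.character(x), tryFormats = c("%Y-%m-%d", "%Y/%m/%d"))
}

parse_dates(factor("1970-01-01"))  # parsed with "%Y-%m-%d"
parse_dates("1970/01/01")          # parsed with "%Y/%m/%d"

# Caveat: the format is chosen from the first non-NA element only and then
# applied to the whole vector, so mixed separators still yield NA
mixed <- parse_dates(c("1970-01-01", "1970/02/02"))
is.na(mixed)
```

Because only the first non-NA element decides the format, this approach suits columns that are consistently formatted; it does not parse each element independently.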