0

I am working an uploaded document originally from google docs downloaded to an xlsx file. This data has been hand entered & formatted to be DD-MM-YY, however this data has uploaded inconsistently (see example below). I've tried a few different things (kicking myself for not saving the code) and it left me with just removing the incorrectly formatted dates.

Any suggestions for fixing this in excel or (preferably) in R? This is longitudinal data so it would be frustrating to have to go back into every excel sheet to update. Thanks!

data <- read_excel("DescriptiveStats.xlsx")

ex:

22/04/13
43168.0

Phil
  • 7,287
  • 3
  • 36
  • 66

1 Answers1

0

43168.0

is a correct date value

22/04/13

is not a valid date. it is a text string. to convert it into date you will need to change it into 04/13/2022

there are a few options. one is to change the locale so the 22/04/13 would be valid. see more over here: locale differences in google sheets (documentation missing pages)

2nd option is to use regex to convert it. see examples:

however, it is very likely that 43168 is also not the correct date. if your date before import was 01/02/2022 then after import it could be: 44563 which is actually 02/01/2022 so be careful. you can check it with:

=TO_DATE(43168)

and date can be checked with:

=ISDATE("22/04/13")
player0
  • 124,011
  • 12
  • 67
  • 124
  • I see what you're saying but 22/04/13 (DD-MM-YY) is a correct day (April, 22 2013). Not sure if that changes anything for the code you recommend. However, the integer maybe wrong if R's default is MM-DD-YY, so going into the local file might be needed unless there is a way to change the default this. Thanks for your input! – Brooke Lamere Oct 19 '22 at 16:49
  • @BrookeLamere can you share a copy /sample of your sheet with example of desired output? – player0 Oct 19 '22 at 16:51