0

I'm trying to read an Excel file where a series of columns names are in date format, i.e.

|ID|Jan-21|Feb-21|Mar-21|etc|

These are held by Excel as numbers and get read in by

df <- readxl::read_excel("filename", sheet = "tab")

as

|ID|44197|44228|44256|etc|

I'd like to convert these back a date format and I've tried this

toDateString <- Vectorize(function(value) {
  number = as.numeric(value)
  if_else(!is.na(number) & number >= 44197 & number <= 44256)
    return(value)
  else
    return(format(number, "%b-%y")))
})

df2 <- df %>% rename_if(isDate, funs(toDateString))

But the new dataframe remains unchanged. I've checked the logic in toDateString and that works. I think its because the function is vectorising as I see the following warnings...

Warning messages:
1: In if (!is.na(number) & number >= 44197 & number <= 44256) return(TRUE) else return(FALSE) :
  the condition has length > 1 and only the first element will be used
2: In if (!is.na(number) & number >= 44197 & number <= 44256) return(TRUE) else return(FALSE) :
  the condition has length > 1 and only the first element will be used

Any ideas welcome...

Chris
  • 1,449
  • 1
  • 18
  • 39

1 Answers1

1

Try this function :

toDateString <- function(x) {
  inds <- grepl('^\\d+$', x)
  x[inds] <- format(as.Date(as.numeric(x[inds]), origin = '1899-12-30'), '%b-%y')
  x
}

df <- data.frame(ID = 1:3, '44197' = rnorm(3), check.names = FALSE)
names(df) <- toDateString(names(df))
df
#  ID Jan-21
#1  1   0.68
#2  2  -0.32
#3  3  -1.31
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213