I have a data frame and I would like to convert its column types. I currently have this function:
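library(dplyr)
library(magrittr)  # provides the %<>% assignment pipe used below

convertDfTypes <- function(obj, types) {
  for (i in seq_along(obj)) {
    # Pick the conversion function matching the requested type name
    FUN <- switch(types[i],
                  character = as.character,
                  numeric = as.numeric,
                  factor = as.factor,
                  integer = as.integer,
                  POSIXct = as.POSIXct,
                  datetime = as.POSIXct)
    name <- names(obj)[i]
    # Build and evaluate e.g. obj %<>% mutate(x = FUN(x))
    expr <- paste0("obj %<>% mutate(", name, " = FUN(", name, "))")
    eval(parse(text = expr))
  }
  return(obj)
}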
myDf <- data_frame(date = seq(Sys.Date() - 4, Sys.Date(), by = 1),
                   x = 1:5,
                   y = 6:10)
colTypes <- c("character", "character", "integer")
str(myDf)
# Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 5 obs. of 3 variables:
# $ date: Date, format: "2015-05-11" "2015-05-12" ...
# $ x : int 1 2 3 4 5
# $ y : int 6 7 8 9 10
myDf %>%
  convertDfTypes(colTypes) %>%
  str
# Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 5 obs. of 3 variables:
# $ date: chr "2015-05-11" "2015-05-12" "2015-05-13" "2015-05-14" ...
# $ x : chr "1" "2" "3" "4" ...
# $ y : int 6 7 8 9 10
(At first I used obj[,i] <- FUN(obj[,i]), but this is very unlikely to work with objects of class tbl.)
It works fine, although it is slow for complex type conversions (e.g. Date/datetime) on "large" data frames. But I don't know whether using eval(parse(text = ...)) is a good idea for column replacement, and I think the function could be improved by avoiding the for loop.
Is there a way to apply a different function to each column, something like mutate_each but with a different function per column rather than the same one for all?
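To make the question concrete, here is a minimal sketch of one possible direction, not a definitive implementation. It assumes base R's Map() is acceptable in place of mutate_each, and it uses a plain data.frame so that it runs without dplyr. The names converters and convertDfTypes2 are hypothetical and only serve the illustration.

```r
# Hypothetical lookup table: type name -> conversion function
converters <- list(
  character = as.character,
  numeric   = as.numeric,
  factor    = as.factor,
  integer   = as.integer,
  POSIXct   = as.POSIXct,
  datetime  = as.POSIXct
)

# Hypothetical loop-free variant: no for loop and no eval(parse(...))
convertDfTypes2 <- function(obj, types) {
  stopifnot(length(types) == length(obj),
            all(types %in% names(converters)))
  # Map() pairs each column with its requested type and applies the
  # matching converter; assigning into obj[] keeps the column names
  # and the class of obj.
  obj[] <- Map(function(col, type) converters[[type]](col), obj, types)
  obj
}

# Same shape as the example data above, built as a base data.frame
myDf2 <- data.frame(date = seq(Sys.Date() - 4, Sys.Date(), by = 1),
                    x = 1:5,
                    y = 6:10)
converted <- convertDfTypes2(myDf2, c("character", "character", "integer"))
str(converted)
```

The lookup list replaces the switch() call, so supporting another type only means adding an entry to converters.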
Do you have any ideas for improving the function?
Thank you