I have several dataframes in which one column needs to be modified while correctly handling NAs, character strings and digits. The dataframes have similar names, and the column of interest is shared by all of them. I wrote a for loop that changes every row of that column correctly. However, I had to create an intermediate object "df" to accomplish that. Is that necessary, or can the original dataframes be modified directly?

sheet1 <- read.table(text="
data
  15448
  something_else
  15334
  14477", header=TRUE, stringsAsFactors=FALSE)
sheet2 <- read.table(text="
data
  16448
  NA
  16477", header=TRUE, stringsAsFactors=FALSE)

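# Names of all sheet dataframes in the workspace (sheet1, sheet2, ...)
sheets <- ls(pattern = "^sheet[0-9]+$")

for (i in seq_along(sheets)) {
  df <- get(sheets[i])  # intermediate copy of the dataframe
  for (y in seq_along(df$data)) {
    # Non-numeric entries become NA here; suppress the coercion warning
    n <- suppressWarnings(as.integer(df$data[y]))
    if (!is.na(n)) {
      df[["data"]][y] <- as.character(as.Date(n, origin = "1899-12-30"))
    }
  }
  assign(sheets[i], df)  # write the modified copy back under the original name
}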
Ferroao

1 Answer

As @d.b. mentions, consider iterating over a list of dataframes, especially if they are similarly structured: you can run the same operations with the apply family of functions, and you avoid managing many separate objects in the global environment. Also consider using the vectorized ifelse to update the column.

If you really do need separate dataframe objects, use list2env to convert each list element back into its own object. The code below wraps the as.* calls in suppressWarnings, since returning NA for non-numeric values is the intended behavior.

sheetList <- mget(ls(pattern = "sheet[0-9]"))

sheetList <- lapply(sheetList, function(df) {
     df$data <- ifelse(is.na(suppressWarnings(as.integer(df$data))), df$data, 
                       as.character(suppressWarnings(as.Date(as.integer(df$data), 
                                                     origin = "1899-12-30"))))  
     return(df)
})

list2env(sheetList, envir=.GlobalEnv)
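As a self-contained variation on the code above, the conversion can be pulled out into a small helper and applied to each dataframe in the list. This is only a sketch: the name `serial_to_date` is hypothetical (not part of base R), and the two sheets are rebuilt inline with `data.frame` so the example runs on its own.

```r
# Hypothetical helper (name is an assumption): convert spreadsheet-style
# serial numbers to ISO date strings, leaving NA and non-numeric entries as-is.
serial_to_date <- function(x, origin = "1899-12-30") {
  n <- suppressWarnings(as.integer(x))  # non-numeric strings become NA
  out <- as.character(x)                # start from the original values
  ok <- !is.na(n)                       # positions holding a usable number
  out[ok] <- as.character(as.Date(n[ok], origin = origin))
  out
}

# Sample data mirroring the question (built inline so the sketch is standalone)
sheet1 <- data.frame(data = c("15448", "something_else", "15334", "14477"),
                     stringsAsFactors = FALSE)
sheet2 <- data.frame(data = c(16448L, NA, 16477L))

sheetList <- list(sheet1 = sheet1, sheet2 = sheet2)

# Update the shared column in every dataframe of the list
sheetList <- lapply(sheetList, function(df) {
  df$data <- serial_to_date(df$data)
  df
})

sheetList$sheet1$data  # dates as strings, "something_else" kept unchanged
sheetList$sheet2$data  # dates as strings, NA kept as NA
```

Because the list elements are modified directly, no per-sheet intermediate object is left behind in the global environment; `list2env` is only needed if the separate objects are required afterwards.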
Parfait
  • What do you need the character string to be? Columns in dataframes are atomic vectors, meaning you cannot mix different types. It is either a date column or a string column. – Parfait Feb 11 '17 at 19:01
  • See update. A very simple change of the `NA` to `df$data` keeps the existing data value. Curious, what type of analysis do you hope to run on such a mixed field of strings and dates? – Parfait Feb 11 '17 at 20:06
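The point made in the comments about atomic columns can be checked with a short sketch. The variable names below are illustrative only: once a string is mixed in, the whole vector (and therefore the whole column) is stored as character.

```r
# Mixing a number, a string and NA in one vector coerces everything to character
x <- c(15448, "something_else", NA)
class(x)          # "character"

# A pure Date column keeps its class ...
df <- data.frame(data = as.Date(c(0, 2), origin = "1899-12-30"))
class(df$data)    # "Date"

# ... but it has to become character before it can hold arbitrary strings
df$data <- as.character(df$data)
df$data[2] <- "something_else"
class(df$data)    # "character"
```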