I would like to convert a number of columns in my df from factor to character. I have written the code to do so as follows:

ColumnsToStrings <- c(2,5,6,25)

for (column in ColumnsToStrings) {
  df[column] <- lapply(df[column], as.character)
}

I would like to reuse this code, so I tried to convert it into a function that accepts two arguments, a df and the vector of columns to convert:

ConvertColumnToString <- function (df, VectorOfColumns) {

  for (column in VectorOfColumns){
    df[column] <- lapply(df[column], as.character)

  }
}

and I call it as follows:

ColumnsToStrings <- c(2,5,6,25)    
df <- ConvertColumnToString(df,ColumnsToStrings)

However, when I call this function, all the values in the df get deleted.

PaulBarr

1 Answer

The loop is redundant in your function, because `lapply` already iterates over the selected columns. Just do:

df <- mtcars[1:6]  # example, all numeric

ConvertColumnToString <- function(df, ColumnsToStrings) {
  df[ColumnsToStrings] <- lapply(df[ColumnsToStrings], as.character)
  return(df)
}

ColumnsToStrings <- c(2, 5, 6)  # the example df has only six columns, so index 25 is dropped

res <- ConvertColumnToString(df, ColumnsToStrings)
sapply(res, class)
#       mpg         cyl        disp          hp        drat          wt 
# "numeric" "character"   "numeric"   "numeric" "character" "character" 
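Since the question is about factor columns while the example above is all numeric, here is a minimal sketch on a small made-up data frame. The `toy` data and its column names are hypothetical and only serve to show that the same function also turns factors into their labels:

```r
# Hypothetical toy data with two factor columns (not taken from the question).
toy <- data.frame(
  id    = 1:3,
  group = factor(c("a", "b", "a")),
  size  = factor(c("10", "20", "30"))
)

ConvertColumnToString <- function(df, ColumnsToStrings) {
  # replace only the selected columns, leave the rest untouched
  df[ColumnsToStrings] <- lapply(df[ColumnsToStrings], as.character)
  df
}

res_toy <- ConvertColumnToString(toy, c(2, 3))
sapply(res_toy, class)  # id: "integer", group: "character", size: "character"
res_toy$size            # "10" "20" "30" (the factor labels, not the integer codes)
```

Selecting by column name, e.g. `c("group", "size")`, should work the same way as selecting by position.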

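Edit: Your version didn't work because the function didn't return anything. Its last expression is the `for` loop, which evaluates to `NULL`, so `df <- ConvertColumnToString(df, ColumnsToStrings)` overwrote your data with `NULL` :) You would only have needed to add a `return` to get it to work: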

ConvertColumnToString <- function (df, VectorOfColumns) {
  for (column in VectorOfColumns ) {
    df[column] <- lapply(df[column], as.character)
  }
  return(df)
}
res2 <- ConvertColumnToString(df, ColumnsToStrings)
sapply(res2, class)
#       mpg         cyl        disp          hp        drat          wt 
# "numeric" "character"   "numeric"   "numeric" "character" "character" 
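To make the difference visible, here is a small sketch that compares a function ending in a `for` loop with one that ends in the data frame itself. `NoReturn` and `ImplicitReturn` are hypothetical names used only for this illustration:

```r
# Hypothetical helper names, used only to illustrate the return value.
NoReturn <- function(df, cols) {
  for (column in cols) {
    df[column] <- lapply(df[column], as.character)  # changes a local copy only
  }
  # the last evaluated expression is the for loop, whose value is NULL
}

ImplicitReturn <- function(df, cols) {
  df[cols] <- lapply(df[cols], as.character)
  df  # the last expression is returned, same effect as return(df)
}

d <- mtcars[1:6]
is.null(NoReturn(d, c(2, 5, 6)))          # TRUE
class(ImplicitReturn(d, c(2, 5, 6))$cyl)  # "character"
class(d$cyl)                              # "numeric", the original is untouched
```

Because the function works on its own copy of `df`, the converted columns only reach the caller through the return value, whether that is written as `df` or `return(df)`.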
jay.sf
  • Out of interest why was my version deleting the data? – PaulBarr Jan 15 '20 at 14:01
  • @PaulBarr Sure, see edit. I also wrapped a `return` around the output of my version (although not absolutely necessary) to make it clear. – jay.sf Jan 15 '20 at 14:35
  • Amazing thank you. Why was it not necessary in your example but was necessary in mine? – PaulBarr Jan 15 '20 at 14:36
  • @PaulBarr It is necessary, see the [edit history](https://stackoverflow.com/posts/59751524/revisions). What I meant is that it's also possible to write just `df` instead of `return(df)` at this position. See [this discussion](https://stackoverflow.com/a/11834490/6574038). – jay.sf Jan 15 '20 at 14:40