I'm trying to update a data frame column inside a function based on a filtered column.

#example dataframe
my.df = data.frame(A=1:10)

#define function to classify column passed as argument 2 based on argument 3
classify = function(df, col, threshold){
  df[df$col<threshold, 2] <- "low"
  df[df$col>=threshold, 2] <- "high"

  return(df)
}

#assign output to new.df
new.df = classify(my.df, A, 5)

I'd expect the new column to contain character values of 'low' or 'high', but instead they're all <NA>.

alaybourn

2 Answers

Pass the column name as a string, "A", and inside the function index the data frame with double brackets, df[[col]], rather than with $. The $ operator does not evaluate its argument, so df$col looks for a column literally named col:

# example dataframe
my.df = data.frame(A=1:10)

# define function to classify column passed as argument 2 based on argument 3
classify = function(df, col, threshold){
  df[df[[col]] < threshold, 2] <- "low"
  df[df[[col]] >= threshold, 2] <- "high"

  return(df)
}

# assign output to new.df
new.df = classify(my.df, "A", 5)

new.df    
#     A   V2
# 1   1  low
# 2   2  low
# 3   3  low
# 4   4  low
# 5   5 high
# 6   6 high
# 7   7 high
# 8   8 high
# 9   9 high
# 10 10 high
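
The difference between the two indexing forms can be checked directly. This is a minimal sketch, assuming the same my.df as above; col here is just an illustrative variable holding the column name:

```r
my.df <- data.frame(A = 1:10)
col <- "A"

# `$` does not evaluate `col`; it looks for a column literally named "col",
# finds none, and returns NULL
dollar_is_null <- is.null(my.df$col)

# `[[` evaluates `col` first, so it retrieves the column named "A"
bracket_matches <- identical(my.df[[col]], my.df$A)

print(dollar_is_null)   # TRUE
print(bracket_matches)  # TRUE
```

Because df$col is NULL in the original function, neither comparison selects any rows, which is consistent with the new column never being filled in.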
Parfait

We can use the development version of dplyr (soon to be released as 0.6.0) to do this. enquo() takes the input argument and converts it to a quosure, which is then evaluated inside mutate/group_by/filter etc. by unquoting it with UQ():

library(dplyr)
classify <- function(df, col, threshold){
   col <- enquo(col)

   df %>%
       mutate(categ = ifelse(UQ(col) < threshold, "low", "high"))

}

classify(my.df, A, 5)
#    A categ
#1   1   low
#2   2   low
#3   3   low
#4   4   low
#5   5  high
#6   6  high
#7   7  high
#8   8  high
#9   9  high
#10 10  high
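
In later releases of dplyr and rlang, the same idea is usually written with !! or the {{ }} ("curly-curly") operator, and UQ() has since been deprecated in rlang. Below is a minimal sketch of that form, assuming dplyr is installed together with rlang 0.4.0 or later; classify2 is a hypothetical name used only for this illustration:

```r
library(dplyr)

my.df <- data.frame(A = 1:10)

# classify2 is a hypothetical variant of classify() above
classify2 <- function(df, col, threshold) {
  # {{ col }} captures the unquoted column name passed by the caller
  # and injects it into the mutate() expression
  df %>%
    mutate(categ = ifelse({{ col }} < threshold, "low", "high"))
}

res <- classify2(my.df, A, 5)
print(res)
```

As with the enquo()/UQ() version, the column is passed unquoted (A, not "A").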
akrun