I'm trying to write a function that takes a list of data frames as an input argument and processes each data frame in turn inside the function.

Example shown below: a function called iter with two inputs: 1) the list of data frames and 2) the name of a column that both df1 and df2 contain.

iter <- function (dflist, columnname) {
  for (df in dflist){
      df[,bla:=cut(columnname, etc)]
      lm(...data=df)
      etc
  }
}

E.g.: dflist = list(df1, df2), where df1 and df2 both contain a column called col1.

I want to write a function such that when I type in iter(dflist,col1)

I get df[,bla:=cut(col1, etc)]

However, whenever I run it now, it gives this error: "object 'col1' not found".

I've tried passing in col1 as a character string and using get(columnname), but to no avail:

iter <- function (dflist, columnname) {
  for (df in dflist){
      df[,bla:=cut(get(columnname), etc)]
      lm(...data=df)
      etc
  }
}

iter(dflist,'col1')

But I get the same error.

Jason

1 Answer

Do you really need the column name to be unquoted? I find it much easier to refer to columns dynamically when the columnname argument is a single character string. You can use as.symbol() (or as.name()) to create a symbol (sym in the function below) that refers to the column itself, as opposed to the string stored in columnname.

You can then use eval() inside the regular data.table syntax to evaluate sym in the data.table's environment, where it resolves to the column.

library(data.table)

dList <- list(
  mtcars,
  mtcars
)

dList <- lapply(dList, function(x) copy(as.data.table(x)))

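iter <- function (dflist, columnname) {
  new_var <- paste0(columnname, "_sq")  # name of the column to create
  sym <- as.symbol(columnname)          # symbol that refers to the column
  for (df in dflist){
    df[,(new_var):= eval(sym)^2]        # := adds the new column by reference
  }
}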

iter(dList, "mpg")

Result...

> head(dList[[2]], 1)
   mpg cyl disp  hp drat   wt  qsec vs am gear carb mpg_sq
1:  21   6  160 110  3.9 2.62 16.46  0  1    4    4    441
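
The same pattern can be carried over to the cut() and lm() steps from the question. The following is only a sketch under assumptions: the helper name fit_binned, the response and n_bins arguments, and the "_bin" suffix are all made up for illustration, and mtcars stands in for the real data. It works on copies, so the input tables are left as they were.

```r
library(data.table)

# Hypothetical helper (not from the question): bin the column named by
# `columnname` with cut(), then regress `response` on the binned variable.
fit_binned <- function(dflist, columnname, response = "mpg", n_bins = 3) {
  sym <- as.symbol(columnname)            # symbol referring to the column
  bin_var <- paste0(columnname, "_bin")   # name of the new binned column
  fits <- vector("list", length(dflist))
  for (i in seq_along(dflist)) {
    dt <- copy(dflist[[i]])               # work on a copy of each table
    dt[, (bin_var) := cut(eval(sym), breaks = n_bins)]
    # Build the formula from strings, e.g. mpg ~ hp_bin
    f <- reformulate(bin_var, response = response)
    fits[[i]] <- lm(f, data = dt)
  }
  fits
}

tables <- list(as.data.table(mtcars), as.data.table(mtcars))
fits <- fit_binned(tables, "hp")
```

Building the formula with reformulate() keeps the column name as a string all the way through, so lm() never has to look up an unquoted col1.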

Keep in mind that iter alters the objects inside dList even though the function has no return statement, because the assignment operator := modifies a data.table by reference. See this Q for a more thorough explanation. If you want to leave the original objects untouched and return a list with the modified tables instead, you need to call data.table::copy first:

iter <- function (dflist, columnname) {
  new_var <- paste0(columnname, "_sq")
  res <- vector("list", length(dflist))
  sym <- as.symbol(columnname)
  for (i in seq_along(res)){
    dt <- copy(dflist[[i]])
    res[[i]] <- dt[,(new_var):= eval(sym)^2]
  }
  return(res)
}
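
A small check of the by-reference behaviour described above, as a sketch with made-up names (add_by_ref, add_on_copy, and the toy table dt are not from the original post):

```r
library(data.table)

dt <- data.table(x = 1:3)

# Adds column y directly to the table that was passed in.
add_by_ref <- function(d) {
  d[, y := x * 2L]
  invisible(NULL)
}

# Copies first, so the caller's table is left alone.
add_on_copy <- function(d) {
  d <- copy(d)
  d[, z := x * 3L]
  d
}

add_by_ref(dt)
out <- add_on_copy(dt)

"y" %in% names(dt)   # TRUE: dt was modified even though nothing was returned
"z" %in% names(dt)   # FALSE: only the copy received z
"z" %in% names(out)  # TRUE
```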
JdeMello