
I am trying to understand how to pass a data frame to an R function. I found an answer to this question on StackOverflow that provides the following demonstration / solution:

Pass a data.frame column name to a function

df <- data.frame(A=1:10, B=2:11, C=3:12)
fun1 <- function(x, column){
  max(x[,column])
}

fun1(df, "B")
fun1(df, c("B","A"))

This makes sense to me, but I don't quite understand the rules for referring to data frame columns within a function. Take the following example:

data(iris)
x.test <- function(df, x){ 
  out <- with(df, mean(x))
  return(out)
}    
x.test(iris, "Sepal.Length")

The output of this is NA, with a warning message. But, if I do the same procedure without the function it seems to work just fine.

with(iris, mean(Sepal.Length))

I'm obviously missing something here -- any help would be greatly appreciated.

Thanks!

Brian P
  • Note that you're passing Sepal.Length in as characters. The equivalent call would be `with(iris, mean("Sepal.Length"))` – Dason May 03 '14 at 15:04
  • @Dason: True, they aren't equivalent, so maybe that wasn't my best phrasing. I guess I am trying to have the function return the same thing as with(iris, mean(Sepal.Length)). When I remove the quotation marks, the function returns an error: "object 'Sepal.Length' not found". – Brian P May 03 '14 at 15:19
  • Why I changed the title: The problem was not in the passage of either the dataframe or the character vector, but with the evaluation inside the function using `with`. – IRTFM May 03 '14 at 16:26

2 Answers


You have already been given the correct advice (use "[" or "[[" rather than with inside functions), but it may also help to consider why the problem occurred. Inside with, you asked mean to return the mean of a character vector, so the result was NA. When you used with at the interactive level, you had no quotes around the column name; if you had quoted it, you would have gotten the same result:

> with(iris, mean('Sepal.Length'))
[1] NA
Warning message:
In mean.default("Sepal.Length") :
  argument is not numeric or logical: returning NA

If you had used R's get mechanism to "promote" a character string to the object it names, you would actually have succeeded, although with is still generally not recommended for programming use:

x.test <- function(df, x){
  out <- with(df, mean(get(x)))    # get() looks up the object named by the string in x, here a column of df
  return(out)
}
x.test(iris, "Sepal.Length")
#[1] 5.843333

See the Details section of the ?with page for warnings about its use in functions.
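
As a minimal sketch of the lookup behaviour described above (the helper name show_lookup is hypothetical and not part of the original post), the following shows what x refers to inside with. When df has no column called x, R falls back to the function argument, which is a character string; when df happens to have a column called x, that column is found first. This ambiguity is one reason with is fragile inside functions.

```r
data(iris)

# Hypothetical helper, for illustration only: report the class of
# whatever `x` resolves to inside with().
show_lookup <- function(df, x) {
  with(df, class(x))
}

# iris has no column named "x", so `x` resolves to the function
# argument, i.e. the string "Sepal.Length".
show_lookup(iris, "Sepal.Length")
# [1] "character"

# A data frame that does have a column named "x": the column wins,
# and the string passed as the second argument is ignored.
show_lookup(data.frame(x = 1:3), "Sepal.Length")
# [1] "integer"
```

In both calls the string itself is never used to select a column, which is why mean(x) inside with cannot return the column mean.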

IRTFM
  • Thanks much for the explanation. A solution was provided, but now I get why `with` was a problem. – Brian P May 03 '14 at 23:36

This will work:

data(iris)
x.test <- function(df, x){ 
  out <- mean(df[, x])
  return(out)
}    
x.test(iris, "Sepal.Length")

Your code is trying to take mean("Sepal.Length") which is clearly not what you want.
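
A slightly more defensive variant, offered only as a sketch: the name x.test2 and the input checks are assumptions added for illustration, not part of the original answer. It uses "[[" so that exactly one column is extracted as a vector, whereas df[, x] returns a data frame when x names several columns, and mean would then return NA with a warning.

```r
data(iris)

# Hypothetical variant of x.test: validate the inputs, then extract a
# single column by name with [[ and take its mean.
x.test2 <- function(df, x) {
  stopifnot(
    is.data.frame(df),
    is.character(x),
    length(x) == 1,
    x %in% names(df)
  )
  mean(df[[x]])
}

x.test2(iris, "Sepal.Length")
# [1] 5.843333
```

Passing a name that is not a column of df stops with an error instead of silently returning NA.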

James King