
I ran into this while working on a larger problem, and I have reduced it to a simpler example that shows the issue. I created a data frame and a function that computes a sum as follows:

df <- data.frame(x=c(1,2,3),y=c(1,2,3))

fun <- function(data,x,y){

    z <- sum(data$x) + sum(data$y)
    return(z)
}

fun(data= df,x = df$x,y = df$y)
[1] 12

The code gives me the expected sum of 12. However, after changing the column names of df to, e.g., "r" and "t", the function returns 0, even though I pass the columns as arguments. What is wrong?

df <- data.frame(r=c(1,2,3),t=c(1,2,3))

fun <- function(data,x,y){

    z <- sum(data$x) + sum(data$y)
    return(z)
}

fun(data= df,x = df$r,y = df$t)
[1] 0

Thanks in advance.

Osiris
  • https://stackoverflow.com/questions/18222286/dynamically-select-data-frame-columns-using-and-a-character-value – Ronak Shah Aug 22 '20 at 13:43

2 Answers


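The issue is that data$x does not use the argument x. It looks for a column named x in the data, which is equivalent to data[["x"]] and not what you intend. The first example only works because the columns happen to be called x and y. Once they are renamed to r and t, data$x and data$y are both NULL, and sum(NULL) is 0.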
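To see where the 0 comes from, here is a minimal check using the same df as in the question. The names missing_col and x are only illustrative.

```r
df <- data.frame(r = c(1, 2, 3), t = c(1, 2, 3))

# `$` looks up the name x itself, not the value of a variable called x.
# df has no column with that name, so the result is NULL.
missing_col <- df$x
is.null(missing_col)

# sum() of NULL is 0, so the function in the question computed 0 + 0.
sum(missing_col)

# `[[` evaluates its argument, so a variable holding the column name works.
x <- "r"
sum(df[[x]])
```

The last call sums the r column because x is evaluated to "r" before the lookup happens.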

Instead, we can pass the column names as character strings and change the extraction from data$x to data[[x]] to get the intended result:

df <- data.frame(r=c(1,2,3),t=c(1,2,3))

fun <- function(data,x,y){
  
  z <- sum(data[[x]]) + sum(data[[y]])
  return(z)
}

fun(data= df,x = "r",y = "t")
#> [1] 12

For this particular example, a clean base R alternative is with():

with(df, sum(r) + sum(t))
Cole
  • Thank you, that helped a lot, also for my other code. Actually, you can use single brackets [] in the function as well, right? – Osiris Aug 22 '20 at 15:49
  • I believe a single bracket on a `data.frame` returns a `data.frame`, whereas a double bracket returns a vector. I am not at a computer to confirm. I think a vector is better for `sum()`. – Cole Aug 22 '20 at 15:56
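The difference between the two bracket forms raised in the comments can be checked with a short sketch like the one below. It assumes the same df as above, and the names single and double are only illustrative.

```r
df <- data.frame(r = c(1, 2, 3), t = c(1, 2, 3))

single <- df["r"]    # single bracket with a name: a one-column data.frame
double <- df[["r"]]  # double bracket: the column itself, a numeric vector

class(single)
class(double)

# Summing the extracted vector is the straightforward case.
sum(double)
```

For an all-numeric data frame like this one, sum() also accepts the one-column data.frame, but extracting the vector with [[ keeps the function's behavior easier to reason about.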

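I would suggest the following approach. Inside your function, data$x looks for a column named x in the data frame instead of using the x you passed in. Since you already pass the columns themselves as arguments, you could sum them directly: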

#Data 1
df <- data.frame(x=c(1,2,3),y=c(1,2,3))

#Function

fun <- function(data,x,y){
  
  z <- sum(x) + sum(y)
  return(z)
}

#Apply

fun(data= df,x = df$x,y = df$y)

Output:

[1] 12

Now second example:

#Apply 2

df <- data.frame(r=c(1,2,3),t=c(1,2,3))

fun(data= df,x = df$r,y = df$t)

Output:

[1] 12
Duck
  • It looks like the `data` argument in `fun()` is redundant because you don't use it anywhere. – Darren Tsai Aug 22 '20 at 14:15
  • @DarrenTsai Yes, you are right, but it is better to keep things close to the OP's code so that they understand. Your comment is pretty useful :) – Duck Aug 22 '20 at 14:23