I am writing an R function that takes a data frame and variable name as inputs, performs some operations on the vector in the data frame corresponding to the input variable name, then returns the resulting vector.

While I can write this function using [] to index, I am trying to learn how to index with $ instead (I understand this may be a bad idea).

My best guess is that I need to paste together what I want as a string and somehow use parse(), eval(), substitute() or some other function, but I suspect there might be a better way. Any help is appreciated.

# Create an arbitrary data frame
df <- data.frame(x = c("a","b","c","c","c"),
                 y = 1:5,
                 z = c(1,9,NA,0,NA))

# Create a vector with character "y",
# capturing the name of the column
# I want to work with in my data
M <- "y"

# Write a function that takes a 
# data frame and a variable name,
# adds 5 to each value of that 
# variable in the data frame, then
# prints the resulting numeric vector.
# Below produces the desired output
# of the function:
print(df$y + 5)

#####################################
# Define function to add 5 to a specified
# variable in the data frame using [] indexing
fun1 <- function(dat, var) {
    # Index the data frame passed in as `dat`, not the global `df`
    dat[ ,var] + 5
}

# Works with both quoted values and
# objects assigned quoted values
fun1(dat = df, var = "y")
fun1(dat = df, var = M)

# However, doesn't work when I use 
# $ instead of []. See function below
# and corresponding results.
fun2 <- function(dat, var) {
  # Uses the `dat` argument; still looks for a column literally named "var"
  dat$var + 5
}

# Doesn't produce intended result
fun2(dat = df, var = "y")
fun2(dat = df, var = M)
socialscientist
  • You should use `[` or `[[` to subset a data frame by variable name; `$` should not be used. See https://stackoverflow.com/questions/18222286/dynamically-select-data-frame-columns-using-and-a-character-value – Ronak Shah Feb 11 '21 at 07:27
  • The link in @RonakShah's comment is the answer to the question. You may also want to take a look at [this SO post](https://stackoverflow.com/questions/1169456/the-difference-between-bracket-and-double-bracket-for-accessing-the-el). – Rui Barradas Feb 11 '21 at 08:05
  • See also `fortunes::fortune(106)`. – Rui Barradas Feb 11 '21 at 08:11

1 Answer

You can do that. You really shouldn't, but you can.

Like all operators in R, `$` is actually a function. Its first argument is a list (possibly, but not necessarily, of class data.frame) and its second argument is a name (an object of type "symbol"). Since `$` does not evaluate its second argument, you need to substitute the name into the call if you intend to pass it programmatically.
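To make the point about the unevaluated second argument concrete, here is a minimal sketch that reuses the `df` from the question. The names `res_dollar` and `res_sum` are only there for illustration.

```r
df <- data.frame(x = c("a", "b", "c", "c", "c"),
                 y = 1:5,
                 z = c(1, 9, NA, 0, NA))
var <- "y"

# Operator form and function-call form are the same call
same <- identical(df$y, `$`(df, y))

# `var` is not evaluated here: R looks for a column literally named "var"
res_dollar <- df$var      # NULL, because df has no column called "var"
res_sum <- df$var + 5     # zero-length numeric, since NULL + 5 is numeric(0)
```

This is why the question's `fun2` returns an empty vector instead of an error: the lookup quietly yields `NULL`, and the arithmetic on `NULL` yields a zero-length result.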

This is how I would do it:

fun2 <- function(dat, var) {
  var <- as.name(var)
  eval(substitute("$"(dat, var) + 5))
}

fun2(dat = df, var = "y")
#[1]  6  7  8  9 10
fun2(dat = df, var = M)
#[1]  6  7  8  9 10
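To see what `substitute()` builds before `eval()` runs it, here is a small sketch. `show_call` is a hypothetical helper, not part of the approach above, that returns the unevaluated call instead of evaluating it. It assumes the `df` from the question.

```r
df <- data.frame(x = c("a", "b", "c", "c", "c"),
                 y = 1:5,
                 z = c(1, 9, NA, 0, NA))

# Hypothetical helper: the same steps as fun2, minus the eval()
show_call <- function(dat, var) {
  var <- as.name(var)              # the string "y" becomes the symbol y
  substitute(`$`(dat, var) + 5)    # fill in dat and var without evaluating
}

built <- show_call(dat = df, var = "y")
print(built)                       # inspect the call that would be evaluated
result <- eval(built)              # evaluate it, as fun2 does internally
```

Printing `built` should show a call in which `y` appears as a bare symbol rather than the string "y", which is the form `$` expects.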

See how much nicer this is if you use [ like the R developers intended?
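For comparison, a minimal sketch of the same function written with `[[`, as the comments on the question recommend. The name `fun3` is made up for this example.

```r
df <- data.frame(x = c("a", "b", "c", "c", "c"),
                 y = 1:5,
                 z = c(1, 9, NA, 0, NA))
M <- "y"

# `[[` evaluates its argument, so a column name held in a variable just works
fun3 <- function(dat, var) {
  dat[[var]] + 5
}

out_literal <- fun3(dat = df, var = "y")
out_object <- fun3(dat = df, var = M)
```

Both calls should give the same vector as `df$y + 5`, with no `eval()` or `substitute()` involved.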

Roland