I'm trying to define a function that uses dplyr::select to select a column that has the same name in multiple data frames, so the user shouldn't need to supply the column's name. For instance, I'd like something like this to work for any data frame that contains a "Sepal.Length" column:

sel_Sepal.Length <- function(df) {
        # The code we are looking for...
}

So that I can apply it like this:

sel_Sepal.Length(iris)

To obtain a result like this:

    Sepal.Length
1            5.1
2            4.9
3            4.7
4            4.6
5            5.0
...          ...

I'm aware of this answer for a similar problem. The difference is that I'd like the function to work without the user passing the column's name, which should be fixed inside the function's code.

This could possibly be considered a trivial question, since one can make the user input the column's name and make it work:

library(dplyr)

selectvar <- function(df, var) {
        var <- enquo(var)   # capture the unquoted column name
        df %>%
                select(!!var)   # unquote it inside select()
}

selectvar(iris, Sepal.Length)

    Sepal.Length
1            5.1
2            4.9
3            4.7
4            4.6
5            5.0
...          ...

But I think there's a concept I'm missing, because I can't make it work the way I described (without passing the column to be selected). I'm asking mainly to find that missing concept, and I hope the answer helps others. Thank you in advance!

Carl
Farid

1 Answer

I may have misunderstood your question. Since you explicitly want the column name to be hard-coded inside the function (at least that's what I infer from "I'd like the function to work without inputting the column's name, which should be fixed inside the function's code"), you could do:

sel_Sepal.Length <- function(df) {
    df %>% select(Sepal.Length)
}

But this means that there is really not much point to the whole function.

Perhaps you can clarify the purpose of the exercise?
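
If the goal is only to keep the column name in one place, a minimal sketch along the same lines could store it as a string and pass it through `all_of()`. The local variable `col` is a name chosen here for illustration, and this assumes a dplyr version that re-exports `all_of()` (1.0.0 or later):

```r
library(dplyr)

# Variant of the function above: the column name is a constant
# defined inside the function, never supplied by the caller.
sel_Sepal.Length <- function(df) {
    col <- "Sepal.Length"        # illustrative local name, fixed here
    df %>% select(all_of(col))   # all_of() treats col as a character vector of column names
}

head(sel_Sepal.Length(iris))
```

Both versions return a one-column data frame, and both raise an error if the data frame has no `Sepal.Length` column.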

Maurits Evers
  • Thank you @MauritsEvers, it works. I think I should have specified that it doesn't work when using the function with `lapply`. For example: `lapply(iris, sel_Sepal.Length)` gives an error. Maybe I should reformulate the question. Thank you for your help! – Farid Mar 21 '19 at 23:51
  • @Farid Using `lapply` doesn't make any sense here. `sel_Sepal.Length` is a function accepting a `data.frame` as argument. `lapply` loops through the column `vectors` of a `data.frame`. What are you actually trying to do? I feel that there exist some fundamental misunderstandings about what `lapply` and `dplyr::select` do. – Maurits Evers Mar 21 '19 at 23:54
  • You're right! I'm sorry, I was asking so that I could apply the function to a list of data.frames, but in the `lapply` example I applied it to a single data.frame. I'm sorry for the naive question. Maybe I should delete it. – Farid Mar 21 '19 at 23:58
  • @Farid Don't delete the question; you should mark it as answered in which case it might benefit somebody else with a related question in the future. There's absolutely no need to say sorry. – Maurits Evers Mar 22 '19 at 00:01
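
To illustrate the point made in the comments, here is a rough sketch of applying the function to a list of data frames rather than to a single one. The list `dfs` and its element names are made up for this example; the only assumption is that every element contains a `Sepal.Length` column:

```r
library(dplyr)

sel_Sepal.Length <- function(df) {
    df %>% select(Sepal.Length)
}

# Hypothetical list of data frames, each with a Sepal.Length column
dfs <- list(
    all    = iris,
    setosa = filter(iris, Species == "setosa")
)

# lapply() hands each list element (a whole data frame) to the function
res <- lapply(dfs, sel_Sepal.Length)

# One single-column data frame per input
sapply(res, nrow)
```

By contrast, `lapply(iris, sel_Sepal.Length)` iterates over the columns of `iris`, so the function receives plain vectors instead of data frames, which is why that call fails.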