I have a series of similar functions that all need to extract some values from a data frame. Something like this:

foo_1 <- function(data, ...) {
  x <- data$x
  y <- data$y

  # some preparatory code common to all foo_X functions

  # .. do some boring stuff with x and y

  # pack and process the result into 'ret'
  return(ret)
}

These functions are then passed as arguments to another function (let us call it "the master function"), which I cannot modify.

However, I would like to avoid rewriting the same preparatory code in each of these functions. For example, I don't want to use data$x directly instead of assigning it to x, because that makes the boring stuff hard to read. At present, I need to write x <- data$x (etc.) in every one of the foo_1, foo_2, ... functions, which is annoying and clutters the code. The packing and processing is also common to all the foo_N functions. Other preparatory code includes scaling of variables or regularization of IDs.

What would be an elegant and terse way of doing this?

One possibility is to attach() the data frame (or use with(), as Hong suggested in the answer below), but then I don't know what other variables would end up in my namespace: attaching data can mask other variables I use in foo_1. Also, the foo_N functions should preferably be called with explicit parameters, so it is easier to see what they need and what they are doing.

Next possibility I was thinking of was a construction like this:

foo_generator <- function(number) {

  # pick the worker function by position
  tocall <- switch(number, foo_1, foo_2, foo_3) # etc.

  function(data, ...) {
    x <- data$x
    y <- data$y
    ret <- tocall(x, y, ...)
    # process and pack into ret
    return(ret)
  }
}

foo_1 <- function(x, y, ...) {
  # do some boring stuff
}

Then I can use foo_generator(1) instead of foo_1 as the argument for the master function.

Is there a better or more elegant way? I feel like I am overlooking something obvious here.

January
  • Can you make a new function for your variable assignment and common code so you don't have to repeat it a bunch of times? Then call that function within your foo_N functions? Or don't put the common stuff in a function and have the common code part in your global environment? – Mike Jul 25 '19 at 16:35

7 Answers

4

You might be overthinking it. You say that the code dealing with preparation and packing is common to all foo_n functions. I assume, then, that # .. do some boring stuff with x and y is where each function differs. If that's the case, just create a single prep_and_pack function which takes a function as a parameter, and then pass in foo_1, foo_2, etc. For example:

prep_and_pack <- function(data, func){
    x <- data$x
    y <- data$y

    # preparatory code here

    xy_output <- func(x, y) # do stuff with x and y

    # code to pack and process into "ret"

    return(ret)
}

Now you can create your foo_n functions that do different things with x and y:

foo_1 <- function(x, y) {
    # .. do some boring stuff with x and y
}

foo_2 <- function(x, y) {
    # .. do some boring stuff with x and y
}

foo_3 <- function(x, y) {
    # .. do some boring stuff with x and y
}

Finally, you can pass multiple calls to prep_and_pack into your master function, where foo_1 etc. are passed in via the func argument:

master_func(prep_and_pack(data = df, func = foo_1),
            prep_and_pack(data = df, func = foo_2),
            prep_and_pack(data = df, func = foo_3)
            )

You could also use switch in prep_and_pack and/or forgo the foo_n functions completely in favor of if-else conditionals to deal with the various cases, but I think the above keeps things nice and clean.

  • I assigned the bounty because your answer got the most votes; however, none of the presented solutions suit me. For example, since I am passing the function as an argument to another function, I cannot control how it is called or give it an extra arg. – January Aug 01 '19 at 09:10
  • Would my solution not still work? You can just call the `foo_n` the way you like, but the preparatory code is all nicely contained in the `prepFun`. Or did I miss something? – Willem Aug 01 '19 at 10:07
  • Does it help, if one rewrites `prep_and_pack` as a function generator? I.e. just adding `generator <- function(data, func) function(data, ...) { ...; output <- func(x,y, ...); ...}` – Alexey Aug 01 '19 at 14:13
  • @January it's not unusual to pass in extra args for an inner function as an arg (or as args) to the outer function, e.g. `prep_and_pack(data = df, func = foo_1, func_args = list(arg1 = val1, arg2 = val2, ...))`. I'm not quite sure what you mean by "how it is called". Can you describe your issue in more detail? –  Aug 01 '19 at 15:33
  • I am building a widget factory for ggplot. The functions are the drawing functions which are passed to the `draw_group` argument of `ggproto`; they are called with a standard set of arguments that I cannot influence. Adding a generator helps, that is true, but this is exactly what I have shown in my question. – January Aug 02 '19 at 07:17
  • In the question, the generator has to know about existence of `func_01`, whereas in the provided answer one may implement new functions `func_x` without touching the generator code, because the function is provided as an argument. – Alexey Aug 02 '19 at 11:08
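
The generator idea from the comments above can be sketched as follows. This is only an illustration, not the poster's actual code: `with_prep`, the stand-in `master_fun`, and the placeholder packing step are hypothetical names and assumptions, and the real master function may call its argument differently.

```r
# Hypothetical factory: wraps a worker function `func(x, y, ...)` in the
# common preparation and packing code. New workers need no change here.
with_prep <- function(func) {
  force(func)  # evaluate the argument now, so each wrapper keeps its own worker
  function(data, ...) {
    # common preparatory code
    x <- data$x
    y <- data$y

    res <- func(x, y, ...)

    # common packing code (placeholder: a named list)
    list(result = res, n = length(x))
  }
}

foo_1 <- function(x, y, ...) x + y
foo_2 <- function(x, y, ...) x - y

# Stand-in for the master function, assumed to call its argument as fun(data)
master_fun <- function(fun, data) fun(data)

df <- data.frame(x = 1:2, y = 3:4)
out_1 <- master_fun(with_prep(foo_1), df)
out_2 <- master_fun(with_prep(foo_2), df)
```

The wrapper returned by `with_prep(foo_1)` has the `function(data, ...)` signature the master function expects, while `foo_1` itself keeps the explicit `x, y` parameters.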
3

The requirements still seem a bit vague to me, but if your code is so similar that you can simply wrap it around a helper function like tocall in your example, and your input is in a list-like structure (like a data frame which is just a list of columns), then just write all your foo_* functions to take the "spliced" parameters like in your proposed solution, and then use do.call:

foo_1 <- function(x, y) {
  x + y
}

foo_2 <- function(x, y) {
  x - y
}

data <- list(x = 1:2, y = 3:4)

do.call(foo_1, data)
# [1] 4 6

do.call(foo_2, data)
# [1] -2 -2
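
One caveat: if the data contains columns the function does not accept, `do.call` fails with an "unused argument" error. A possible workaround, sketched here with a hypothetical helper `call_with_cols` and assuming the worker function has no `...` in its signature, is to select only the columns named in the function's formals:

```r
foo_1 <- function(x, y) x + y

# A data frame with an extra column that foo_1 does not know about
df <- data.frame(x = 1:2, y = 3:4, id = c("a", "b"))

# Hypothetical helper: pass only the columns that match the function's
# argument names (assumes `func` has no `...` argument).
call_with_cols <- function(func, data) {
  do.call(func, as.list(data[names(formals(func))]))
}

res <- call_with_cols(foo_1, df)

# For comparison, the direct call fails because of the `id` column
direct <- try(do.call(foo_1, df), silent = TRUE)
```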
Alexis
3

I'm not sure the following is a good idea. It reminds me a bit of programming with macros. I don't think I would do this. You'd need to carefully document because it is unexpected, confusing and not self-explanatory.

If you want to reuse the same code in different functions, it might be an option to create it as an unevaluated call and evaluate that call in the different functions:

prepcode <- quote({
  x <- data$x
  y <- data$y
})

foo_1 <- function(data, ...) {
  eval(prepcode)
  # some preparatory code common to all foo_X functions

  # .. do some boring stuff with x and y

  # pack and process the result into 'ret'
  return(list(x, y))
}

L <- list(x = 1, y = "a")
foo_1(L)
#[[1]]
#[1] 1
#
#[[2]]
#[1] "a"

It might be better to pass prepcode as an argument to foo_1, to make sure there are no scoping issues.
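
A minimal sketch of that variant, assuming the argument is called `prep` (a hypothetical name) and defaults to the shared code:

```r
# Shared preparation code, stored as an unevaluated expression
prepcode <- quote({
  x <- data$x
  y <- data$y
})

# `prep` is a hypothetical argument name; it defaults to the shared code
foo_1 <- function(data, ..., prep = prepcode) {
  eval(prep)  # evaluated in foo_1's own frame, so x and y become local variables
  list(x, y)
}

res_default <- foo_1(list(x = 1, y = "a"))

# A caller can also supply different preparation code explicitly
res_custom <- foo_1(list(x = 1, y = "a"),
                    prep = quote({
                      x <- data$x * 2
                      y <- data$y
                    }))
```

The expression is still evaluated inside `foo_1`, so it relies on a variable named `data` existing there; that coupling is what needs documenting.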

Roland
1

Use with inside the function:

foo_1 <- function(data, ...) {
  with(data, {
    # .. in here, x and y refer to data$x and data$y
  })
}
Hong Ooi
  • This has the same drawback as `attach`. `data` may contain loads of other stuff and I don't control it and I can't remove the other stuff. Consider `adjustment=9 ; with(data, print(adjustment))` when `data$adjustment != NULL`. – January Jul 11 '19 at 09:09
  • You are _defining_ `foo_1` and you thus control everything that it does, and everything that it needs. – Hong Ooi Jul 11 '19 at 09:28
  • That general statement is certainly true, but using `with` is one of the ways one can sometimes lose that control, by using an environment which is built on data which I don't control, don't you agree? I could, of course, put all the code within the `with()` call, but then (i) it would not be very elegant and (ii) it would add to the things that can go wrong and that I need to worry about. Or I could construct `data2` using only what I need and then use `with` on that. But then again I need to repeat boring code in every function. Not to mention the fact that I would prefer to call `foo(x,y)`. – January Jul 11 '19 at 09:41
  • The solution is to write `foo_1` so that you don't have to worry about clashing variable names. But also, there are clearly requirements that you didn't put into the question. Voting to close as unclear.... – Hong Ooi Jul 11 '19 at 09:46
  • Which requirements would that be? Also, yes, of course the solution *is* to write `foo_1` such that I don't need to worry about clashing variable names, I just say that using `with` is a way of making sure that I *do* need to worry about clashing variable names. – January Jul 11 '19 at 09:58
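
The masking concern discussed in these comments can be reproduced with a small example. The function name `demo` and the values are made up for illustration; restricting `data` to the needed columns before calling `with` is one possible way around it.

```r
demo <- function() {
  adjustment <- 9  # local variable the function intends to use
  data <- data.frame(x = 1:2, adjustment = c(100, 200))

  list(
    # `adjustment` resolves to the data column, masking the local variable
    masked = with(data, x * adjustment),
    # with only the needed column exposed, the local variable is used
    subset = with(data["x"], x * adjustment)
  )
}

res <- demo()
```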
1

I'm not sure I understand fully, but can't you simply use a function for all common stuff, and then unpack that into the foo_N functions using list2env? For example:

prepFun <- function(data, ...) {
    x <- data$x
    y <- data$y
    tocall(x, y, ...)
    # process and pack into a named list called ret
    # so you know specifically what the elements of ret are called
    return(ret)
}

foo_1 <- function(data, ...) {

  # Get prepFun to do the prepping, then use list2env to get the result in foo_1
  # You know which are the named elements, so there should be no confusion about
  # what does or does not get masked.
  prepResult <- prepFun(data, ...)
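  # envir must be given explicitly; by default list2env() creates a new environment
  list2env(prepResult, envir = environment())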

  # .. do some boring stuff with x and y

  # pack and process the result into 'ret'
  return(ret)

}

Hope this is what you're looking for!
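
A runnable sketch of this pattern, under the assumption that the preparation consists of scaling `x` (the helper name `prep_fun` and the extra `adjustment` column are made up for illustration):

```r
# Hypothetical helper: extracts and prepares only the variables that are needed
prep_fun <- function(data) {
  list(
    x = as.numeric(scale(data$x)),  # example preparation: centre and scale x
    y = data$y
  )
}

foo_1 <- function(data, ...) {
  # Unpack the named list into this function's own environment
  list2env(prep_fun(data), envir = environment())

  # .. do some boring stuff with x and y
  x + y
}

# The extra column is never unpacked, so it cannot mask anything in foo_1
df <- data.frame(x = c(1, 2, 3), y = c(10, 20, 30), adjustment = 0)
res <- foo_1(df)
```

Because only the names returned by `prep_fun` are unpacked, the set of variables created in `foo_1` is known in advance.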

Willem
1

I think defining a function factory for this task is a bit overkill and confusing. You can define a general function and use purrr::partial() on it when passing it to your master function.

Something like:

foo <- function(data, ..., number, foo_funs) {
  tocall <- foo_funs[[number]]
  ret <- with(data[c("x", "y")], tocall(x, y, ...))
  # process and pack into ret
  return(ret)
}

foo_1 <- function(x, y, ...) {
  # do some boring stuff
}

foo_funs <- list(foo_1, foo_2, ...)

Then call master_fun(fun = purrr::partial(foo, number = 1, foo_funs = foo_funs), ...)

moodymudskipper
0

Another possibility is to use list2env, which saves the components of a list into a specified environment:

foo_1 <- function(data){
  list2env(data, envir = environment())
  x + y
}

foo_1(data.frame(x = 1:2, y = 3:4))

See also this question.

Cettt
  • How would that help? Assigning local variables was only one of the tasks that need to be done in each and every one of the `foo_X` functions. – January Jul 25 '19 at 11:32