Truly understanding lapply et al

Question

I have been using R for a long time and am very happy using the map-family of functions as well as rowwise. I really just don't get the apply-family, even after reading many a tutorial. Right now it's very much up to chance if I get any apply function to work, and if I do, I'm not sure why it did in that case. Could anyone give an intuitive explanation of the syntax? E.g. why does the code below fail?

# Packages the pipelines below rely on
library(dplyr)
library(tidyr)
library(purrr)

stupid_function = function(x,y){
  a = sum(x,y)
  b = max(x,y)
  return(list(MySum=a,MyMax=b))
}

mtcars %>%
  rowwise() %>%
  mutate(using_rowwise = list(stupid_function(vs, am))) %>%
  unnest_wider(using_rowwise)

mtcars %>%
  mutate(using_map = pmap(list(vs,am),stupid_function)) %>%
  unnest_wider(using_map)

mtcars %>%
  mutate(using_lapply = lapply(list(vs,am), stupid_function))

Using rowwise and pmap I get what I want/expect. But the last line yields the following error:

 Error: Problem with `mutate()` input `using_lapply`.
x argument "y" is missing, with no default
i Input `using_lapply` is `lapply(list(vs, am), stupid_function)`.
Run `rlang::last_error()` to see where the error occurred.

`lapply` isn't equivalent to `pmap`; it's equivalent to `map`, so your `using_map` example, which actually uses `pmap`, isn't a good comparison. `lapply` iterates over its first argument only. You want to iterate over two arguments. – Gregor Thomas, Aug 21 '20 at 00:17
You mention many tutorials, but in case you haven't seen it, the FAQ [on apply functions here on Stack Overflow](https://stackoverflow.com/a/7141669/903061) is my favorite. – Gregor Thomas, Aug 21 '20 at 00:22

Ben Norris · Accepted Answer · 2020-08-21T00:26:06.937

The lapply() function has the following usage (from ?lapply).

lapply(X, FUN, ...)

The X argument is a list, vector, or data.frame: something with elements. (For a data.frame, the elements are its columns.) The FUN argument is some function. lapply applies FUN to each element of X and returns the outputs in a list of the same length as X. The first element of this list is FUN(X[[1]]) and the second is FUN(X[[2]]).
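As a minimal sketch of that behaviour (the list x below is made-up toy data, not something from the question), each output element is FUN applied to the matching input element:

```r
# Toy input: a named list with three integer vectors (hypothetical data).
x <- list(a = 1:3, b = 4:6, c = 7:9)

# lapply calls sum() once per element and always returns a list.
res <- lapply(x, sum)

# Each result equals FUN applied to the matching element, extracted with [[ ]].
identical(res[[1]], sum(x[[1]]))
identical(res[[2]], sum(x[[2]]))

# A data.frame is a list of columns, so lapply iterates over columns, not rows.
col_means <- lapply(mtcars[c("vs", "am")], mean)
names(col_means)
```

This column-wise behaviour is why lapply alone does not give the row-by-row result that rowwise() produces.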

In your example, lapply(list(vs,am), stupid_function), the list has two elements, so lapply calls stupid_function(vs) and then stupid_function(am), passing one argument each time. stupid_function requires two arguments, so y is missing, which is exactly the error you see. This is where the ... comes in: any additional arguments you put there are passed on to FUN on every call. You just need to name them correctly. So, in your case, you would use lapply(vs, stupid_function, y = am).
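A short sketch of both calls, run outside mutate() so that vs and am are pulled from mtcars explicitly (stupid_function is restated from the question so the block runs on its own):

```r
stupid_function <- function(x, y) {
  list(MySum = sum(x, y), MyMax = max(x, y))
}

vs <- mtcars$vs
am <- mtcars$am

# list(vs, am) has two elements, so lapply makes two calls:
# stupid_function(vs) and stupid_function(am). Neither supplies y.
err <- tryCatch(
  lapply(list(vs, am), stupid_function),
  error = function(e) conditionMessage(e)
)
err

# Passing y through ... runs without error, but the whole am vector
# is handed to every call: stupid_function(vs[1], am), stupid_function(vs[2], am), ...
res <- lapply(vs, stupid_function, y = am)
length(res)
identical(res[[1]], stupid_function(vs[1], am))
```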

However, this isn't really what you want either. It passes the entire am vector as y on every call instead of iterating over am alongside vs. lapply only iterates over one variable, not two. You want to use a map function for this, or you need to iterate over row indices, like the following:

lapply(seq_len(nrow(mtcars)), function(i) stupid_function(mtcars$vs[i], mtcars$am[i]))
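Base R also has Map() (a wrapper around mapply() with SIMPLIFY = FALSE), which iterates over several inputs in parallel, much like pmap. The following is a minimal sketch rather than the only way to do it; stupid_function is restated from the question, and the names by_map, by_index and wide are just illustrative:

```r
stupid_function <- function(x, y) {
  list(MySum = sum(x, y), MyMax = max(x, y))
}

# Map walks both vectors in step:
# stupid_function(vs[1], am[1]), stupid_function(vs[2], am[2]), ...
by_map <- Map(stupid_function, mtcars$vs, mtcars$am)

# The index-based lapply from above, for comparison.
by_index <- lapply(seq_len(nrow(mtcars)), function(i) {
  stupid_function(mtcars$vs[i], mtcars$am[i])
})

# Both approaches produce the same per-row results.
identical(unname(by_map), by_index)

# One way to get a wide result: turn each per-row list into a one-row
# data.frame and stack them.
wide <- do.call(rbind, lapply(by_map, as.data.frame))
head(wide)
```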

edited Aug 21 '20 at 00:26

answered Aug 21 '20 at 00:16


This is a good answer up until the end. With `rowwise` or `pmap`, OP is trying to get `stupid_function(vs[1], am[1])`, `stupid_function(vs[2], am[2])`, .... Iterating over two inputs in parallel isn't cleanly possible with `lapply`. The base R equivalent would be `Map`. – Gregor Thomas Aug 21 '20 at 00:19
Your recommended fix of `lapply(vs, stupid_function, y = am)` will give `stupid_function(vs[1], am)`, `stupid_function(vs[2], am)`, ... – Gregor Thomas Aug 21 '20 at 00:20
@GregorThomas - Good catch. I updated my answer to incorporate this point and provide a way to do this with `lapply` if one desperately felt they had to. – Ben Norris Aug 21 '20 at 00:27
