Let's say I have a list of data.frames:

dflist <- list(data.frame(a=1:3), data.frame(b=10:12, a=4:6))

If I want to extract the first column from each item in the list, I can do:

lapply(dflist, `[[`, 1)
# [[1]]
# [1] 1 2 3
# 
# [[2]]
# [1] 10 11 12

Why can't I use the `$` function in the same way?

lapply(dflist, `$`, "a")
# [[1]]
# NULL
# 
# [[2]]
# NULL

But these both work:

lapply(dflist, function(x) x$a)
`$`(dflist[[1]], "a")

I realize that in this case one could use

lapply(dflist, `[[`, "a")

but I was working with an S4 object that didn't seem to allow indexing via [[. For example:

library(adegenet)
data(nancycats)
catpop <- genind2genpop(nancycats)
mylist <- list(catpop, catpop)

#works
mylist[[1]]$tab

#doesn't work
lapply(mylist, "$", "tab")
# Error in slot(x, name) : 
#   no slot of name "..." for this object of class "genpop"

#doesn't work
lapply(mylist, "[[", "tab")
# Error in FUN(X[[1L]], ...) : this S4 class is not subsettable
MrFlick
  • This one works `lapply(dflist, function(x) "$"(x, "a"))`. – Tim May 08 '15 at 20:00
  • Nice question. Fyi, the answer is sort of findable with `methods("$",dflist[[1]])` – Frank May 08 '15 at 20:00
  • Well @Frank, it's not that I was unaware that `$.data.frame` existed, I'm just surprised the problem was caused by method dispatching. I can't think of many other cases where you have to explicitly call one form of a generic function. – MrFlick May 08 '15 at 20:02
  • @MrFlick -- I'm with you on that. It must (?) have something to do with the odd lazy-evaluation aspects of `lapply()`, but since some (or, really, all) of that happens down at the level of C-code, I've never been fully able to grasp what it's doing under the hood. – Josh O'Brien May 08 '15 at 20:05
  • @JoshO'Brien I think it might have more to do with deparsing of parameters with `$` than lapply specifically. See this example: `f<-function(x,...) \`$\`(x, ...); f(dflist[[1]], "a"); \`$\`(dflist[[1]], "a")`. This is because `$` isn't a "typical" generic, it's a `.Primitive()` so I bet the secret lies [here](https://github.com/wch/r-source/blob/f19bb79b339dd487fe9dc3ed1b4686f26dcb1974/src/main/subset.c#L1132) – MrFlick May 08 '15 at 20:13
  • Thanks for this link. I did some searching but didn't find this one. Maybe `$` is not Google-friendly. – mt1022 May 24 '18 at 14:29

2 Answers

For the first example, you can just do:

lapply(dflist, `$.data.frame`, "a")

For the second, use the slot() accessor function:

lapply(mylist, "slot", "tab")

I'm not sure why method dispatch doesn't work in the first case, but the Note section of ?lapply does address this very issue of its borked method dispatch for primitive functions like $:

 Note:

 [...]

 For historical reasons, the calls created by ‘lapply’ are
 unevaluated, and code has been written (e.g., ‘bquote’) that
 relies on this.  This means that the recorded call is always of
 the form ‘FUN(X[[i]], ...)’, with ‘i’ replaced by the current
 (integer or double) index.  This is not normally a problem, but it
 can be if ‘FUN’ uses ‘sys.call’ or ‘match.call’ or if it is a
 primitive function that makes use of the call.  This means that it
 is often safer to call primitive functions with a wrapper, so that
 e.g. ‘lapply(ll, function(x) is.numeric(x))’ is required to ensure
 that method dispatch for ‘is.numeric’ occurs correctly.
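As a rough illustration of the wrapper approach the note recommends, here is a minimal sketch using the `dflist` from the question. The S4 class `Toy` and its `tab` slot are hypothetical stand-ins for `genpop`, chosen only so the example runs without adegenet.

```r
library(methods)

# Data from the question
dflist <- list(data.frame(a = 1:3), data.frame(b = 10:12, a = 4:6))

# Wrapping `$` in an anonymous function gives R an ordinary call to
# evaluate, so the column name is written out literally as `a`
cols <- lapply(dflist, function(x) x$a)

# Hypothetical S4 class standing in for "genpop" (an assumption,
# not the real adegenet class)
setClass("Toy", representation(tab = "numeric"))
s4list <- list(new("Toy", tab = c(1, 2)), new("Toy", tab = c(3, 4)))

# slot() is a regular function that evaluates its arguments,
# so it can be handed to lapply() directly
tabs <- lapply(s4list, slot, "tab")
```

The wrapper costs a few extra characters but avoids relying on how a primitive reads the call that `lapply` builds.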
Josh O'Brien
  • Thanks so much for your insights. The `slot()` function is what I was really after in the end so I appreciate you bringing that to my attention. I've added another answer which I think gets closer to the "why" in this case. It's really more about the generic `$` implementation rather than `lapply()` from what I understand at this point. – MrFlick May 08 '15 at 20:48
13

So it seems that this problem has more to do with $ and how it expects an unquoted name, rather than a string, as its second parameter. Look at this example:

dflist <- list(
    data.frame(a=1:3, z=31:33), 
    data.frame(b=10:12, a=4:6, z=31:33)
)
lapply(dflist, 
    function(x, z) {
        print(paste("z:",z)); 
        `$`(x,z)
    }, 
    z="a"
)

We see the results:

[1] "z: a"
[1] "z: a"
[[1]]
[1] 31 32 33

[[2]]
[1] 31 32 33

So z is being set to "a", but $ isn't evaluating its second parameter. It takes the symbol z literally and returns the "z" column rather than the "a" column. This leads to this interesting set of results:

a<-"z"; `$`(dflist[[1]], a)
# [1] 1 2 3
a<-"z"; `$`(dflist[[1]], "z")
# [1] 31 32 33

a<-"z"; `$.data.frame`(dflist[[1]], a)
# [1] 31 32 33
a<-"z"; `$.data.frame`(dflist[[1]], "z")
# [1] 31 32 33

When we call $.data.frame directly we are bypassing the standard deparsing that occurs in the primitive prior to dispatching (which happens near here in the source).
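To make the non-evaluation concrete, here is a small sketch. The variable name `col` is made up for illustration; the point is only that `$` uses the symbol it is given, while `[[` evaluates its argument first.

```r
df <- data.frame(a = 1:3, z = 31:33)
col <- "z"   # hypothetical variable holding a column name

# `$` takes the symbol `col` literally and looks for a column named "col"
by_dollar <- df$col        # no such column, so this is NULL

# `[[` evaluates `col` first, then looks up the column "z"
by_bracket <- df[[col]]
```

This is why `[[` is usually the safer choice whenever the column name is held in a variable.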

The added catch with lapply is that it passes arguments along to the function via the ... mechanism. For example:

lapply(dflist, function(x, z) sys.call())
# [[1]]
# FUN(X[[2L]], ...)

# [[2]]
# FUN(X[[2L]], ...)

This means that when $ is invoked, it deparses the ... to the string "...". This explains the following behavior:

dflist <- list(data.frame(a=1:3, "..."=11:13, check.names=FALSE))
lapply(dflist, `$`, "a")
# [[1]]
# [1] 11 12 13

The same thing happens when you try to use ... yourself:

f<-function(x,...) `$`(x, ...); 

f(dflist[[1]], "a");
# [1] 11 12 13
`$`(dflist[[1]], "a")
# [1] 1 2 3
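If the name really has to travel through `...` or a variable, one option is to make sure `$` never sees the unevaluated `...`. A sketch with two hypothetical helpers, `get_col` and `get_col2` (neither is from the original answer):

```r
df <- data.frame(a = 1:3, "..." = 11:13, check.names = FALSE)

# Hypothetical helper: `[[` evaluates its index, so forwarding a name works
get_col <- function(x, name) x[[name]]

# Hypothetical helper: list(x, ...) is evaluated before do.call() builds
# the call, so `$` receives the string "a" rather than the symbol `...`
get_col2 <- function(x, ...) do.call("$", list(x, ...))

res1 <- lapply(list(df, df), get_col, "a")
res2 <- get_col2(df, "a")
```

Both helpers return the "a" column even though the data frame also has a column literally named "...".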
MrFlick
  • I feel like I might be being dense, but I don't quite see how this explains why ``lapply(dflist, `$`, "a")`` returns `NULL`. After all, `"$"(dflist[[1]], "z")` returns `31:33`, but the apparently equivalent call, ``lapply(dflist[1], `$`, "z")``, returns `NULL`... What am I missing? – Josh O'Brien May 08 '15 at 21:09
  • Ah, ok. Well, there is an extra level with `lapply`. It passes in parameters with `...`. So if you intercept the call, you see that the second parameter passed to the function is "...". So you're right. It does also have to do with how lapply passes arguments via `...`. I'll add that in as well. – MrFlick May 08 '15 at 21:12
  • Oh, man, I see what you're getting at. That's fascinating! Check this out: ``df <- data.frame("..."=1:3, z=31:33); dflist <- list(df, df); lapply(dflist, `$`, "z")`` – Josh O'Brien May 08 '15 at 21:16
  • @JoshO'Brien Heh, yea, that's exactly what I was editing in! :) – MrFlick May 08 '15 at 21:17