Many thanks in advance for any advice or hints. I'm working with data frames. A simplified version of the code is as follows: `

f<-function(name){
    x<-tapply(name$a,list(name$b,name$c),sum)
    y<-dataset[[deparse(substitute(name))]] # line 1)
    #where dataset is an already existing list object with the same names as the 
    #function argument. I would like to avoid passing two arguments.
    z<-vector("list",n) #where n is also defined already
    for (i in 1:n){z[[i]]<-x[y[[i]],i]} # line 2)
    ...
}
lapply(list_names,f) 

`

The warning message is: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL'

and the output is incorrect. While debugging I found that the conflict may lie in lines 1) and 2). However, when I call f(name) directly it works fine and the output is correct. I guess the problem is in lapply; I searched for a while but could not get to the bottom of it. Any ideas? Many thanks!

The structure of the data

Thanks Joran. Checking again, I found the problem might not lie where I had described. The full code follows; you can copy and paste it to see the error.

n<-4
name1<-data.frame(a=rep(0.1,20),b=rep(1:10,each=2),c=rep(1:n,each=5),
                  d=rep(c("a1","a2","a3","a4","a5","a6","a7","a8","a9","a91"),each=2))
name2<-data.frame(a=rep(0.2,20),b=rep(1:10,each=2),c=rep(1:n,each=5),
                  d=rep(c("a1","a2","a3","a4","a5","a6","a7","a8","a9","a91"),each=2))
name3<-data.frame(a=rep(0.3,20),b=rep(1:10,each=2),c=rep(1:n,each=5),
                  d=rep(c("a1","a2","a3","a4","a5","a6","a7","a8","a9","a91"),each=2))
#d is the name for the observations. d corresponds to b.
dataset<-vector("list",3)
names(dataset)<-c("name1","name2","name3")
dataset[[1]]<-list(c(1,2),c(1,2,3,4),c(1,2,3,4,5,10),c(4,5,8))
dataset[[2]]<-list(c(1,2,3,5),c(1,2),c(1,2,10),c(2,3,4,5,8,10))
dataset[[3]]<-list(c(3,5,8,10),c(1,2,5,7),c(1,2,3,4,5),c(2,3,4,6,9))
f<-function(name){
  x<-tapply(name$a,list(name$b,name$c),sum)
  rownames(x)<-sort(unique(name$d)) #the row names for 
  y<-dataset[[deparse(substitute(name))]]
  z<-vector("list",n)
  for (i in 1:n){
    z[[i]]<-x[y[[i]],i]}
  nn<-length(unique(unlist(sapply(z,names)))) # the number of names appeared
  names_<-sort(unique(unlist(sapply(z,names)))) # the names appeared add to the matrix 
                                                # below
  m<-matrix(,nrow=nn,ncol=n);rownames(m)<-names_
  index<-vector("list",n)
  for (i in 1:n){
    index[[i]]<-match(names(z[[i]]),names_)
    m[index[[i]],i]<-z[[i]]
  }
  return(m)
}
list_names<-vector("list",3)
list_names[[1]]<-name1;list_names[[2]]<-name2;list_names[[3]]<-name3
names(list_names)<-c("name1","name2","name3")
lapply(list_names,f)
f(name1)

lapply(list_names,f) fails, but f(name1) produces exactly the matrix I want. Thanks again.

Paul
  • Can you provide a reproducible example? – Jonathan Christensen Jan 19 '13 at 02:31
  • Can you provide a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example)? Also, you shouldn't need the gymnastics of `deparse(substitute(..))`; `dataset[[name]]` should work fine. But without your `dataset`, `n` and `list_names` we can't help much. – Justin Jan 19 '13 at 02:33
  • Thanks Jonathan and Justin. I have updated the data structure, is it more clear to you now? Thanks for your patience. – Paul Jan 19 '13 at 03:16
  • No, it is not any more clear. Go back to the link Justin provided and read it carefully to learn how to provide a _reproducible_ example. We should be able to copy+paste code into an R session and run it ourselves. – joran Jan 19 '13 at 03:36
  • Thanks again joran, the example has been added and you can copy-paste it and see. As I am not sure where the problem lies, I tried to keep the whole example as simple as possible. Thanks again. – Paul Jan 19 '13 at 05:31

1 Answer


Why it doesn't work

The issue is that the call stack differs between the two cases. Under lapply, it looks like

[[1]]
lapply(list_names, f) # lapply(X = list_names, FUN = f)

[[2]]
FUN(X[[1L]], ...)

In the expression being evaluated, f is referred to as FUN, and the expression supplied for its argument name is X[[1L]].

When you call f directly, the stack is simply

[[1]]
f(name1) # f(name = name1)

Usually this doesn't matter, but with substitute it does, because substitute returns the unevaluated expression that was supplied for the argument, not its value. When you get to

y<-dataset[[deparse(substitute(name))]]

inside lapply, the code looks for an element of dataset named X[[1L]]. There isn't one, so y is bound to NULL.
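The difference can be seen in a small standalone sketch. The helper show_arg and the one-element dataset below are hypothetical names made up for illustration, and the exact text deparsed under lapply may differ between R versions, so the sketch only checks that it is not "name1".

```r
# Hypothetical helper: returns the text of the expression supplied as `name`.
show_arg <- function(name) deparse(substitute(name))

name1 <- data.frame(a = 1)

# Called directly, the supplied expression is the symbol name1.
direct <- show_arg(name1)

# Called through lapply, the supplied expression is lapply's internal
# indexing expression (shown as X[[1L]] in the stack above).
via_lapply <- lapply(list(name1 = name1), show_arg)[[1]]

# Made-up lookup list: indexing a list by a name it does not have gives NULL.
dataset <- list(name1 = list(c(1, 2)))
lookup <- dataset[[via_lapply]]

print(direct)          # the symbol's name
print(via_lapply)      # lapply's internal expression, not "name1"
print(is.null(lookup)) # TRUE: no element of dataset has that name
```

The NULL lookup is what later produces the is.na() warning once y is indexed inside the loop.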

A way to get it to work

The simplest fix is probably to have f operate on character strings and pass names(list_names) to lapply. This can be done fairly easily by changing the beginning of f to

f<-function(name){
  passed.name <- name
  name <- list_names[[name]]
  x<-tapply(name$a,list(name$b,name$c),sum)
  rownames(x)<-sort(unique(name$d)) #the row names for 
  y<-dataset[[passed.name]]
# the rest of f...

and changing lapply(list_names, f) to lapply(names(list_names),f). This should give you what you want with nearly minimal modification. You might also consider renaming some of your variables so that the word name isn't used for so many different things: the function names, the argument of f, and all the various variables containing name.
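Put together, the approach might look like the following sketch, which reuses the data from the question. The names make_df, frames, f_by_name and key are introduced here for illustration only and are not from the original code, and the body is a condensed rewrite of the question's f rather than a verbatim copy.

```r
n <- 4
labels <- c("a1", "a2", "a3", "a4", "a5", "a6", "a7", "a8", "a9", "a91")

# Hypothetical helper that builds one of the question's data frames.
make_df <- function(v) {
  data.frame(a = rep(v, 20), b = rep(1:10, each = 2),
             c = rep(1:n, each = 5), d = rep(labels, each = 2))
}
frames <- list(name1 = make_df(0.1), name2 = make_df(0.2), name3 = make_df(0.3))

dataset <- list(
  name1 = list(c(1, 2), c(1, 2, 3, 4), c(1, 2, 3, 4, 5, 10), c(4, 5, 8)),
  name2 = list(c(1, 2, 3, 5), c(1, 2), c(1, 2, 10), c(2, 3, 4, 5, 8, 10)),
  name3 = list(c(3, 5, 8, 10), c(1, 2, 5, 7), c(1, 2, 3, 4, 5), c(2, 3, 4, 6, 9))
)

# f takes the key as a string and looks up both the data frame and the
# row selections by that key, so substitute() is not needed.
f_by_name <- function(key) {
  df <- frames[[key]]
  x <- tapply(df$a, list(df$b, df$c), sum)
  rownames(x) <- sort(unique(as.character(df$d)))
  y <- dataset[[key]]
  # Selected rows of each column, as named vectors.
  z <- lapply(seq_len(n), function(i) x[y[[i]], i])
  names_ <- sort(unique(unlist(lapply(z, names))))
  m <- matrix(NA_real_, nrow = length(names_), ncol = n,
              dimnames = list(names_, NULL))
  for (i in seq_len(n)) m[names(z[[i]]), i] <- z[[i]]
  m
}

# setNames keeps the keys as names on the resulting list.
result <- lapply(setNames(names(frames), names(frames)), f_by_name)
```

Another option along the same lines would be Map(function(df, sel) ..., list_names, dataset), which pairs each data frame with its selections by position and avoids name lookups inside the function altogether.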

user1935457