I would like to make a series of plots using ggplot from multiple different dataframes. I was planning on using a list and iterating over the list as follows:

libraries <- objects() #make a list of the dataframes we want to graph
for(i in libraries) {
  # create initial plots
  x1 <- qplot(data= i, V1, reorder(V2,V3), color = V3) + coord_flip()
  x2 <- ggplot(i, aes(x=reorder(V2,V3), group=V3, color=V3)) + geom_bar() 
  x3 <- ggplot(i, aes(x=V1, group=V3, color=V3)) + coord_flip() + geom_bar()
}

However, I get the error message:

Error: ggplot2 doesn't know how to deal with data of class factor

presumably because 'libraries' is a character vector of object names and not a list of data frames, so each `i` in the loop is a name rather than the data frame itself. Does anyone have another suggestion on how to iterate through the dataframes? I suppose I could merge them with plyr and then ggplot a subset of the data, but that seems to add more work.

zach

2 Answers

The usual way to iterate over data.frames (which are just regularly organized lists) is with lapply:

df1 <- data.frame(date = as.Date(10*365*rbeta(100, .5, .1)), group = "a")
df2 <- data.frame(date = as.Date(10*365*rbeta(50, .1, .5)), group = "b")
df3 <- data.frame(date = as.Date(10*365*rbeta(25, 3, 3)), group = "c")
dfrmL <- list(df1, df2, df3)

lapply(dfrmL, NROW)
[[1]]
[1] 100

[[2]]
[1] 50

[[3]]
[1] 25

In the case of producing a list of ggplot-objects I would imagine that the Hadley-method would instead be to use llply, but I'm not a skilled plyr-user, so let me suggest this totally untested code template:

# build one plot per data frame; V1, V2, V3 are the column names from the question
plts <- lapply(dfrmL, function(df) {
  qplot(data = df, V1, reorder(V2, V3), color = V3) +
    coord_flip()
})
# you may need to explicitly print() or plot() the plots as stated in the R-FAQ.
lapply(plts, print)
IRTFM
  • thanks @DWin, but what if you want to use many dataframes? Let's say I have 30 dataframes. I suppose I could manually write a list, but the idea of using objects() was to get the list of dataframes. Do you know of a way to get an `lapply`-able list of dataframes in your workspace? – zach Oct 16 '11 at 23:23
  • Your workspace is an environment and there is an `eapply` function. – IRTFM Oct 16 '11 at 23:26
  • @Zach - check out this question for an example of identifying all objects of a given class (data.frame for example) which may be useful for you: http://stackoverflow.com/questions/5158830/identify-all-objects-of-given-class-for-further-processing – Chase Oct 16 '11 at 23:31
  • @DWin I think joran has a good way of getting a useable list of datasets and I will try to iterate through them as you suggest. thanks – zach Oct 16 '11 at 23:54
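The `eapply` suggestion in the comments above can be sketched as follows. This is only an illustration under stated assumptions: the environment `env` and the objects `df1`, `df2` and `not_a_df` are hypothetical stand-ins, and for a real workspace you would presumably pass `globalenv()` instead of `env`.

```r
# Hypothetical environment standing in for the workspace (assumption).
env <- new.env()
assign("df1", data.frame(x = 1:3), envir = env)
assign("df2", data.frame(x = 4:6), envir = env)
assign("not_a_df", "some text", envir = env)

# eapply() calls a function on every object in an environment and
# returns a named list; unlist() turns that into a named logical vector.
is_df <- unlist(eapply(env, is.data.frame))

# Keep only the names that refer to data frames, then fetch the objects.
# mget() returns a named list, which lapply() can iterate over directly.
df_list <- mget(sort(names(is_df)[is_df]), envir = env)
```

Because `df_list` is a named list, the names can be reused later, for example as plot titles or file names.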
A more complete, reproducible example may help us suggest a better way to accomplish this, but at the very least I can suggest replacing:

libraries <- objects()

with this

libraries <- lapply(objects(), FUN = get)

which will actually build a list of all the objects in the current environment. But I somehow doubt that the data frames are the only objects in your environment, so perhaps you would rather grab the list of objects using objects or ls, use grep (or a related function) to find only your data frames based on their names and then get just those data frames using lapply.

Finally, you can then iterate over them as @DWin describes.
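A minimal sketch of the name-based selection described above, assuming the data frames share a naming prefix. The `lib_` prefix, the environment `env` and the objects in it are hypothetical; substitute whatever pattern matches your own object names, and use `globalenv()` for the actual workspace.

```r
# Hypothetical environment and objects (assumptions for illustration).
env <- new.env()
assign("lib_a", data.frame(V1 = 1:3), envir = env)
assign("lib_b", data.frame(V1 = 4:6), envir = env)
assign("other", 42, envir = env)

# ls() only returns object *names* (a character vector), which is why
# looping over it directly hands ggplot a string instead of a data frame.
all_names <- ls(envir = env)

# grep() with value = TRUE keeps the names matching the pattern.
df_names <- grep("^lib_", all_names, value = TRUE)

# mget() turns the selected names into a named list of the objects.
libraries <- mget(df_names, envir = env)

# The list can now be iterated over with lapply()/vapply().
row_counts <- vapply(libraries, nrow, integer(1))
```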

joran
  • thanks joran. I am going for something like this: `df1 <- diamonds[1:100,] df2 <- diamonds[101:200,] libraries <- lapply(objects(), FUN = get) #make a list of the dataframes we want to graph plts <- lapply(libraries, function(df) qplot(data= df, x,y, color = clarity) ) lapply(plts, function(pic) print(pic) png(file=pic.jpg) dev.off() )` – zach Oct 16 '11 at 23:52
  • try `ggsave` instead of `png+dev.off` (you'd need `print()` in the middle, btw). – baptiste Oct 17 '11 at 00:13
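The `ggsave` suggestion above could look roughly like this. It is a sketch, not a tested recipe for the asker's data: it assumes ggplot2 is installed, uses the `diamonds` subsets from the comment, and the output directory, file names and plot dimensions are arbitrary choices made for illustration.

```r
library(ggplot2)

# Two subsets of the diamonds data, as in the comment above.
dfs <- list(df1 = diamonds[1:100, ], df2 = diamonds[101:200, ])

# Build one plot per data frame.
plts <- lapply(dfs, function(df) {
  ggplot(df, aes(x = x, y = y, color = clarity)) + geom_point()
})

# File names are derived from the list names; tempdir() is an
# assumption, use any writable directory instead.
files <- file.path(tempdir(), paste0(names(plts), ".png"))

# ggsave() writes a plot object straight to a file, so no explicit
# png()/print()/dev.off() sequence is needed.
for (i in seq_along(plts)) {
  ggsave(files[i], plot = plts[[i]], width = 5, height = 4)
}
```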