Row Binding a Set of Data Sets?

Question

I have read in a list of data sets and called it n. I want to take a subset of the data sets in n and row bind them together in R. When I try rbind(n), I just get a data frame of the names of the data sets instead of the rows of each data set stacked underneath each other. What I want is to bind the subsets of data sets that share a common name. For example, 18 of the data sets have names starting with "4." and I want to bind all of these together. Is there an easy way to do this?

score 11 · Accepted Answer · edited May 23 '17 at 10:28

What you want to do is rbind(n[[1]], n[[2]], ...), which is not the same as rbind(n).

You don't need to write this out by hand; you can use do.call to build and execute the call for you:

 do.call(rbind, n)

which runs the command you want. However, this is notoriously slow when the list is long.
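To make the difference concrete, here is a small sketch in base R. The list n and its contents are made up for illustration; only rbind and do.call come from the answer above.

```r
# Hypothetical stand-in for the list of data sets read in by the asker
n <- list(a = data.frame(id = 1:2, value = c(10, 20)),
          b = data.frame(id = 3:4, value = c(30, 40)),
          c = data.frame(id = 5L,  value = 50))

# rbind(n) treats the list itself as a single row: one cell per data set
wrong <- rbind(n)

# do.call passes each list element as a separate argument,
# i.e. it builds rbind(n[[1]], n[[2]], n[[3]])
stacked <- do.call(rbind, n)

nrow(stacked)   # one row per original row across all data sets
```

Because n is a named list, the row names of stacked are built from the element names; rownames(stacked) <- NULL resets them if that is unwanted.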

You can use rbindlist from the data.table package to do the same thing much faster:

 library(data.table)

 rbindlist(n)

If you only want those elements whose names start with 4:

rbindlist(n[grep(names(n), pattern = '^4')])
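Note that the pattern '^4' also matches names such as "40". If the names must start with the literal text "4.", something along these lines may be closer to what the question asks for. This is a base R sketch; the list n and its element names are invented for illustration.

```r
# Hypothetical list whose element names mimic the file names in the question
n <- list("4.1" = data.frame(x = 1),
          "4.2" = data.frame(x = 2),
          "40"  = data.frame(x = 3),
          "5.1" = data.frame(x = 4))

# startsWith compares literal text, so the dot is not a regex wildcard
keep <- startsWith(names(n), "4.")

# Equivalent regular expression: the dot has to be escaped
keep_regex <- grepl("^4\\.", names(n))

# Names containing "4." anywhere, as asked in the comments
contains <- grepl("4.", names(n), fixed = TRUE)

# Bind only the selected data sets
bound <- do.call(rbind, n[keep])
```

The same logical index works with rbindlist, e.g. rbindlist(n[keep]).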

answered Feb 25 '13 at 02:45

mnel


Awesome thank you that does exactly what I want! Now how can I do this for a subset, say files containing "4." somewhere in their name within n without actually having to go look which numbers in the list those files belong to? – user1836894 Feb 25 '13 at 03:09
@user1836894 Are these the names of the elements of the list? – mnel Feb 25 '13 at 03:10
Yes they are. I have a list of files and those are the names belonging to a subset of the files. – user1836894 Feb 25 '13 at 03:13
@user1836894 added an approach, let me know if it isn't suitable. – mnel Feb 25 '13 at 03:19
+1! @mnel maybe change the pattern from numeric to a valid file name! it looks odd a file name beginning with 4. and you can give the list of names `n[filtred.names]`. – agstudy Feb 25 '13 at 03:23
Yes! Thank you that did what I wanted! I'm sorry to be a bother but is it possible to do this bind with a different number of columns in some of the data sets? For example such as 'rbind.fill' would, but with the 'rbindlist' you suggested? – user1836894 Feb 25 '13 at 03:42
I have a `not-optimized` approach [here](http://stackoverflow.com/a/15017231/1385941). This will not be as fast as rbindlist. – mnel Feb 25 '13 at 03:55

score 1 · Answer 2 · answered Feb 25 '13 at 03:31

If you are aggregating many files, you might need the rbind.fill function from the plyr package (I don't know if there is a data.table equivalent):

ll <- list(a = data.frame(x = 1, y = 2, z = 1),
           b = data.frame(x = 2, y = 3),
           c = data.frame(x = 3:4, y = 5))

library(plyr)
## subset by list names, then apply rbind.fill recursively using Reduce
Reduce(rbind.fill, ll[c('a', 'b')])
##   x y  z
## 1 1 2  1
## 2 2 3 NA

agstudy


I don't know of an rbind.fill equivalent, but I implemented something http://stackoverflow.com/a/15017231/1385941 – mnel Feb 25 '13 at 03:55
