42

I am sorry if this question has been answered already. Also, this is my first time on stackoverflow.

I have a beginner R question concerning lists, data frames, and merge() and/or rbind().

I started with a panel that looks like this:

COUNTRY  YEAR  VAR
A        1
A        2
B        1
B        2

For efficiency purposes, I created a list that consists of one data frame for each country and performed a variety of calculations on each individual data.frame. However, I cannot seem to combine the individual data frames into one large frame again.

rbind() and merge() both tell me that only replacement of elements is allowed.

Could someone tell me what I am doing wrong and how to actually recombine the data frames?

Thank you

CGN

5 Answers

44

Maybe you want to do something like:

do.call("rbind", my.df.list)
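As a minimal sketch of the full round trip (split by country, work on each piece, recombine), assuming a small made-up panel: the VAR values and the VAR2 calculation below are hypothetical placeholders, not taken from the question.

```r
# Hypothetical panel; the VAR values are invented for illustration.
panel <- data.frame(COUNTRY = c("A", "A", "B", "B"),
                    YEAR    = c(1, 2, 1, 2),
                    VAR     = c(10, 20, 30, 40),
                    stringsAsFactors = FALSE)

# One data frame per country.
my.df.list <- split(panel, panel$COUNTRY)

# Placeholder per-country calculation: add a derived column.
my.df.list <- lapply(my.df.list, function(d) {
  d$VAR2 <- d$VAR * 2
  d
})

# do.call() passes every list element to rbind() as a separate argument.
combined <- do.call("rbind", my.df.list)
rownames(combined) <- NULL  # drop the "A.1"-style row names rbind() builds
```

Because the list is built with split(), this keeps working without changes when another country is added to the panel.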

Ben
datanalytics.com
  • Unfortunately, this returns an error (I think because not all panels are balanced?) Either way, the above command worked. Thank you though. – CGN Mar 07 '10 at 05:07
  • That solution works, but it is slow – Kots Oct 05 '17 at 11:21
15

dplyr provides the bind_rows() function for that:

library(dplyr)

foo <- list(df1 = data.frame(x = c('a', 'b', 'c'), y = c(1, 2, 3)),
            df2 = data.frame(x = c('d', 'e', 'f'), y = c(4, 5, 6)))

bind_rows(foo)
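If the per-country data frames do not all have the same columns, bind_rows() may still be usable, since it matches columns by name and fills missing ones with NA. A small sketch, assuming a hypothetical named list by_country with invented values; the .id argument turns the list names into a column:

```r
library(dplyr)

# Hypothetical per-country pieces; B has an extra year and an extra column.
by_country <- list(
  A = data.frame(YEAR = c(1, 2),    VAR = c(10, 20)),
  B = data.frame(YEAR = c(1, 2, 3), VAR = c(30, 40, 50),
                 EXTRA = c(TRUE, FALSE, TRUE))
)

# .id stores each list element's name in a new COUNTRY column;
# columns absent from a piece (EXTRA for A) are filled with NA.
combined <- bind_rows(by_country, .id = "COUNTRY")
```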
lbcommer
10

Note that the basic solution

do.call("rbind", my.df.list)

will be slow if we have many dataframes. A scalable solution is:

library(data.table)
rbindlist(my.df.list)

which, from the docs, is the same as do.call("rbind", l) on data.frames, but much faster.
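One thing worth knowing is that rbindlist() returns a data.table rather than a plain data.frame. A short sketch, assuming a hypothetical named list with invented values, that also uses the idcol argument to keep track of which list element each row came from:

```r
library(data.table)

# Hypothetical per-country pieces.
my.df.list <- list(
  A = data.frame(YEAR = c(1, 2), VAR = c(10, 20)),
  B = data.frame(YEAR = c(1, 2), VAR = c(30, 40))
)

# idcol adds a COUNTRY column taken from the list names.
combined <- rbindlist(my.df.list, idcol = "COUNTRY")

# Convert back if downstream code expects a plain data.frame.
combined_df <- as.data.frame(combined)
```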

alberto
4

There might be a better way to do this, but this seems to work and it's straightforward. (My code has four lines so that it's easier to see the steps; these four could easily be combined.)

# first re-create your data frame:
A = matrix( ceiling(10*runif(8)), nrow=4)
colnames(A) = c("country", "year_var")
dfa = data.frame(A)

# now re-create the list you made from the individual rows of the data frame:
df1 = dfa[1,]
df2 = dfa[2,]
df3 = dfa[3,]
df4 = dfa[4,]
df_all = list(df1, df2, df3, df4)

# to recreate your original data frame:
x = unlist(df_all)                      # flatten the list into a single vector, stored one row at a time
A = matrix(x, nrow=4, byrow=TRUE)       # refill row-wise so each row gets its own country and year_var
colnames(A) = c("country", "year_var")  # put the column names back on
dfa = data.frame(A)                     # from the matrix, create your original data frame
doug
  • Thank you for the script. It worked quite well, my only worry with this is that it doesn't automatically update if I were to add a country. (although I suppose with a for-loop I could do that, too) – CGN Mar 07 '10 at 05:05
4

plyr is probably best. Another useful approach if the data frames can be different is to use reshape:

library(reshape)
data <- merge_recurse(listofdataframes)
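For comparison, a rough base-R sketch of the same idea, folding merge() over the list with Reduce(). This is an approximation for illustration, not the reshape implementation, and the data frames and column names are invented. Note that merge() joins on the columns the two inputs share; when they share none, it returns every combination of rows, which can grow very large.

```r
# Hypothetical data frames that share a COUNTRY key but differ otherwise.
listofdataframes <- list(
  data.frame(COUNTRY = c("A", "B"), gdp  = c(1, 2), stringsAsFactors = FALSE),
  data.frame(COUNTRY = c("A", "B"), pop  = c(3, 4), stringsAsFactors = FALSE),
  data.frame(COUNTRY = c("B", "C"), area = c(5, 6), stringsAsFactors = FALSE)
)

# Merge pairwise from left to right, keeping unmatched rows (all = TRUE).
merged <- Reduce(function(x, y) merge(x, y, all = TRUE), listofdataframes)

# With no shared column names, merge() builds the Cartesian product.
crossed <- merge(data.frame(a = 1:3), data.frame(b = 1:2))
```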

Look at my answer to this related question on merging data frames.

Shane
  • Do you have any idea why merge_recurse uses so much memory? I have 4 dataframes of 196 rows with one column, and it requires 6gb of memory – Gabriel G. Feb 02 '21 at 21:44