
I am new to programming and R is my first programming language to learn.

I want to merge 100 dataframes; each dataframe contains one column and 20 observations, as shown below:

df1 <- as.data.frame(c(6,3,4,4,5,...))
df2 <- as.data.frame(c(2,2,3,5,10,...))
df3 <- as.data.frame(c(5,9,2,3,7,...))
...
df100 <- as.data.frame(c(4,10,5,9,8,...))

I tried using df.list <- list(df1:df100) to combine all of the dataframes into one overall dataframe, but I am not sure whether df.list actually merges the columns from all the dataframes into a single table.

Can anyone tell me if I am right? And what do I need to do?

  • Possible duplicates: https://stackoverflow.com/a/24376207/12993861 or https://stackoverflow.com/a/68136880/12993861 – stefan Jun 26 '21 at 22:30
  • Does this answer your question? [Combine a list of data frames into one data frame by row](https://stackoverflow.com/questions/2851327/combine-a-list-of-data-frames-into-one-data-frame-by-row) – GuedesBF Jun 27 '21 at 02:52

2 Answers


We can use mget to collect all the objects into a list. The pattern passed to ls matches object names that start (^) with 'df', followed by one or more digits (\\d+), up to the end ($) of the string:

df.list <- mget(ls(pattern = '^df\\d+$'))
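As a small illustration of what that call returns, here is a sketch with three made-up data frames (the values and the column name x are placeholders, not the asker's data). One thing worth checking is the order of the list: ls() sorts names alphabetically, so df10 comes before df2. Building the names with paste0 is one way to keep numeric order.

```r
# Three hypothetical one-column data frames standing in for df1 ... df100
df1  <- data.frame(x = c(6, 3, 4))
df2  <- data.frame(x = c(2, 2, 3))
df10 <- data.frame(x = c(5, 9, 2))

# ls() returns names sorted alphabetically, which is not numeric order
by_pattern <- mget(ls(pattern = '^df\\d+$'))
names(by_pattern)

# Building the names explicitly keeps the numeric order
by_number <- mget(paste0('df', c(1, 2, 10)))
names(by_number)
```

With the real data, mget(paste0('df', 1:100)) would give a list ordered df1, df2, ..., df100.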

From the list, if we want to cbind all the datasets, use cbind in do.call:

out <- do.call(cbind, df.list)
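A minimal sketch of the cbind step on made-up data, to show the shape of the result. The values and the column name x are assumptions. Because the column names of the combined data frame come from the inputs and may not be unique, this sketch sets them explicitly from the list names.

```r
# Hypothetical stand-ins for the asker's data frames
df1 <- data.frame(x = c(6, 3, 4))
df2 <- data.frame(x = c(2, 2, 3))
df3 <- data.frame(x = c(5, 9, 2))

df.list <- list(df1 = df1, df2 = df2, df3 = df3)

# One column per input data frame, placed side by side
out <- do.call(cbind, df.list)

# Give each column the name of the data frame it came from
names(out) <- names(df.list)

dim(out)    # rows x columns
names(out)
```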

NOTE: It is better not to create multiple objects in the global environment. We could instead read all the data directly into a list, or construct the data frames inside a list in the first place. For example, if the data come from .csv files, get the file names from the directory of interest with list.files, loop over them with lapply, read each one with read.csv, and cbind the results:

files <- list.files(path = 'path/to/your/location', 
         pattern = '\\.csv$', full.names = TRUE)
out <- do.call(cbind, lapply(files, read.csv))
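If the data frames are produced by a loop rather than read from files, the same idea might look like the sketch below. Here make_output() is a hypothetical stand-in for whatever each iteration computes, and the column name value is also an assumption. The results go straight into a list, so no df1 ... df100 objects are created, and the combined data frame can then be summed.

```r
# Hypothetical stand-in for the per-iteration computation:
# returns a one-column data frame with 20 observations
make_output <- function(i) {
  data.frame(value = (1:20) * i)
}

# Build the list directly instead of creating 100 separate objects
df.list <- lapply(1:100, make_output)
names(df.list) <- paste0('df', seq_along(df.list))

# Combine column-wise and label each column by its source
out <- do.call(cbind, df.list)
names(out) <- names(df.list)

# Sum across the 100 columns for each of the 20 observations
row_totals <- rowSums(out)

# Grand total over all values
total <- sum(row_totals)
```

If only the element-wise sum is needed, Reduce(`+`, df.list) is another option that skips the cbind step.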
akrun
  • Thanks! I do understand it's better not to create multiple objects in the global environment. I originally applied a for loop function to generate outputs from a dataset which are the dataframes I got in the global environment. But I then want to sum up all the outputs (dataframes) so I thought combining all these dataframes and sum them up together will be a solution. Is there any alternative method I can use to avoid creating these objects in the global environment whilst achieving the same result? – SssssssAaaaaaa Jun 27 '21 at 14:32

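We can also use the reduce function from the purrr package, after creating a character vector with the names of the data frames. The reduction starts from the first data frame and iterates over the remaining names, so that df1 is not bound twice. Note that bind_rows and rbind stack the data frames by row; to place them side by side, use bind_cols or cbind instead.

library(dplyr)
library(purrr)

df_names <- paste0("df", 1:100)

df_names[-1] %>% 
  reduce(.init = get(df_names[1]), ~ bind_rows(..1, get(..2)))

Or in base R:

Reduce(function(x, y) rbind(x, get(y)), df_names[-1], init = get(df_names[1]))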
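A small base R check of the row-wise reduction, using three made-up data frames (the values, the column name x, and the variable df_names are assumptions). The reduction starts from the first data frame and walks over the remaining names, so no data frame is counted twice; the do.call(rbind, ...) line is an equivalent shortcut shown for comparison.

```r
# Hypothetical stand-ins for the asker's data frames
df1 <- data.frame(x = c(6, 3))
df2 <- data.frame(x = c(2, 2))
df3 <- data.frame(x = c(5, 9))

df_names <- paste0('df', 1:3)

# Start from df1, then append df2 and df3 row-wise
stacked <- Reduce(
  function(acc, nm) rbind(acc, get(nm)),
  df_names[-1],
  get(df_names[1])
)

# Same values, obtained by binding the whole list at once
stacked_alt <- do.call(rbind, mget(df_names))

nrow(stacked)
```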
Anoushiravan R