I have a list of 366 data frames; each data frame contains 3 columns: "i", "j" and "Value". I want to merge these data frames into a single data frame to do statistical analysis, such as mean, mode, and median. Each data frame contains almost the same number of observations.
-
Since they do not have the same observations, perhaps you mean to combine them so that you have one frame with three columns, is that right? Perhaps you want a fourth column to indicate which frame each row originally belonged to? – r2evans Feb 17 '20 at 21:23
-
https://stackoverflow.com/questions/2851327/convert-a-list-of-data-frames-into-one-data-frame might be what you are looking for. – Ronak Shah Feb 18 '20 at 00:28
2 Answers
Base R options:
set.seed(42)
listdat <- replicate(3, data.frame(i=sample(100, size=2), j=sample(100, size=2), Value=sample(100, size=2)), simplify = FALSE)
str(listdat)
# List of 3
# $ :'data.frame': 2 obs. of 3 variables:
# ..$ i : int [1:2] 92 93
# ..$ j : int [1:2] 29 83
# ..$ Value: int [1:2] 65 52
# $ :'data.frame': 2 obs. of 3 variables:
# ..$ i : int [1:2] 74 14
# ..$ j : int [1:2] 66 70
# ..$ Value: int [1:2] 46 72
# $ :'data.frame': 2 obs. of 3 variables:
# ..$ i : int [1:2] 94 26
# ..$ j : int [1:2] 47 94
# ..$ Value: int [1:2] 98 12
Starting with that, the first thing we can do is just combine them row-wise, all in one go:
do.call(rbind, listdat)
# i j Value
# 1 92 29 65
# 2 93 83 52
# 3 74 66 46
# 4 14 70 72
# 5 94 47 98
# 6 26 94 12
It might be nice to include which index they came from. If they are not named, then you can just include the index number:
do.call(rbind, Map(cbind, listdat, num=seq_along(listdat)))
# i j Value num
# 1 92 29 65 1
# 2 93 83 52 1
# 3 74 66 46 2
# 4 14 70 72 2
# 5 94 47 98 3
# 6 26 94 12 3
If they have names, we can use the same technique with the names instead of the index number:
names(listdat) <- c("A","B","C")
do.call(rbind, Map(cbind, listdat, name=names(listdat)))
# i j Value name
# A.1 92 29 65 A
# A.2 93 83 52 A
# B.1 74 66 46 B
# B.2 14 70 72 B
# C.1 94 47 98 C
# C.2 26 94 12 C
Per @akrun's commented suggestion, here are two external-package suggestions that are a bit shorter.
# 'dplyr'
dplyr::bind_rows(listdat) # if no names present
dplyr::bind_rows(listdat, .id = 'name') # with names
# 'data.table'
data.table::rbindlist(listdat) # if no names present
data.table::rbindlist(listdat, idcol = 'name') # with names
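Once the frames are combined, the statistics the question asks about can be computed on the single frame. The following is a minimal base R sketch under stated assumptions: the small `listdat` here uses made-up values so the results are easy to check by hand, `stat_mode` is a hypothetical helper (base R has no built-in function for the statistical mode), and grouping by `i` and `j` is only a guess at what the OP wants.

```r
# Small deterministic stand-in for the list of data frames (made-up values)
listdat <- list(
  A = data.frame(i = c(1, 2), j = c(1, 1), Value = c(10, 20)),
  B = data.frame(i = c(1, 2), j = c(1, 1), Value = c(10, 40)),
  C = data.frame(i = c(1, 2), j = c(1, 1), Value = c(30, 60))
)

# Combine row-wise, keeping the source name in a 'name' column
combined <- do.call(rbind, Map(cbind, listdat, name = names(listdat)))

# Hypothetical helper: most frequent value (the first one wins on ties)
stat_mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

# Statistics over every row of the combined frame
overall_mean   <- mean(combined$Value)
overall_median <- median(combined$Value)
overall_mode   <- stat_mode(combined$Value)

# Statistics per (i, j) cell across all of the original frames
cell_mean   <- aggregate(Value ~ i + j, data = combined, FUN = mean)
cell_median <- aggregate(Value ~ i + j, data = combined, FUN = median)
```

If the frames hold missing values, `mean()` and `median()` accept `na.rm = TRUE`; whether dropping them is appropriate depends on the analysis.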

r2evans
-
Or, with `dplyr`, `bind_rows(listdat, .id = 'name')`; or it may be more efficient with `rbindlist(listdat, idcol = 'name')`, as the OP has lots of datasets – akrun Feb 17 '20 at 21:38
-
Yes, I thought about that, I just didn't have time up-front to demo those admittedly much-shorter snippets. Thanks! – r2evans Feb 17 '20 at 21:55
Assuming the data sets are in your working directory and have some unique identifier in the filename (e.g. "dataset": "dataset1.csv", "dataset2.csv", "dataset3.csv", etc.), and you don't mind using tidyverse, the following should work:
library(tidyverse)
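# list.files() returns the file names; 'pattern' keeps only the matching CSV files
# (str_extract() would return the matched text "dataset", not the file names)
file_names <- list.files(pattern = "dataset.*\\.csv$")
# read each file into a data frame, then stack them row-wise
my_df <- map(file_names, ~ read_csv(.x)) %>% bind_rows()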
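For readers who cannot install the tidyverse, the same read-and-stack idea can be sketched in base R alone. This is only an illustration under assumptions: the temporary directory, the three tiny CSV files, and the `file` column are all made up here so that the example runs on its own; in practice `list.files()` would point at the real data directory.

```r
# Write three small CSVs to a temporary directory (hypothetical stand-ins
# for "dataset1.csv", "dataset2.csv", ...)
tmp_dir <- file.path(tempdir(), "dataset_demo")
dir.create(tmp_dir, showWarnings = FALSE)
for (k in 1:3) {
  write.csv(data.frame(i = k, j = k, Value = k * 10),
            file.path(tmp_dir, paste0("dataset", k, ".csv")),
            row.names = FALSE)
}

# Collect the matching file paths
file_names <- list.files(tmp_dir, pattern = "^dataset.*\\.csv$",
                         full.names = TRUE)

# Read each file into a data frame and name the list by file name
frames <- lapply(file_names, read.csv)
names(frames) <- basename(file_names)

# Stack row-wise, recording the source file in a 'file' column
my_df <- do.call(rbind, Map(cbind, frames, file = names(frames)))
```

With several hundred files, `do.call(rbind, ...)` can be slow; the `data.table::rbindlist()` call mentioned in the other answer is a common alternative.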

Adam B.
-
Please consider including only the packages that you need. `tidyverse` is a gargantuan meta-package that not everybody has installed or can install. (I have several computers where I cannot install arbitrary packages.) In this case, we only need `dplyr`, `purrr`, `stringr`, and `readr`. Just like we encourage questions to be specific on listing non-base packages, it is also a courtesy to provide the same level of specificity in our answers. Thanks! – r2evans Feb 17 '20 at 21:30
-
Duly noted. I'm a bit of a `tidyverse` partisan and I think it's especially helpful for new users just starting out with R because it's a lot more human-friendly than base R (which I myself started learning on), and for the average, non-expert user who just wants to run some analyses on their laptop and not have to worry about things like optimization, tidyverse is often a life-saver. I'll add an addendum though. – Adam B. Feb 17 '20 at 21:36
-
I'm not arguing against the use of tidyverse packages (that's a completely separate topic), I'm just commenting on the fact that not everybody has tidyverse installed, and that's quite a long/big ordeal if you do it "just" to try this answer. That's all, thanks! – r2evans Feb 17 '20 at 21:54