I have a list of data frames whose columns I want to combine, matching rows on a shared key column. To illustrate, here's a dummy set:

Data1 <- data.frame(A = c(1, 2, 3, 4, 5),
                    B = c(2, 3, 5, 3, 10))
Data2 <- data.frame(A = c(1, 2, 3, 4, 6), 
                    C = c(3, 4, 8, 12, 2))
Data3 <- data.frame(A = c(1, 2, 3, 4, 6), 
                    D = c(4, 3, 1, 9, 2))
list <- list(Data1, Data2, Data3)

I want the output to look like this:

A  B  C  D
1  2  3  4
2  3  4  3
3  5  8  1
4  3 12  9
5 10 NA NA
6 NA  2  2

My real data has many data frames inside each list, and I have many lists, so I would like code that does not have to name each data frame explicitly, which is what I have been doing with the merge() function.

Thank you!

Drew
  • Possible duplicate https://stackoverflow.com/questions/8091303/simultaneously-merge-multiple-data-frames-in-a-list – Ronak Shah May 28 '20 at 03:25

1 Answer

We can use reduce from purrr with full_join from dplyr:

library(dplyr)
library(purrr)
reduce(list, full_join, by = 'A')
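For anyone who would rather not load extra packages, a base R version of the same idea might look like the sketch below. It swaps in Reduce and merge(all = TRUE) for reduce and full_join, and the name dfs is only an illustrative stand-in for the question's list (chosen so that base::list is not masked).

```r
# Dummy data from the question
Data1 <- data.frame(A = c(1, 2, 3, 4, 5), B = c(2, 3, 5, 3, 10))
Data2 <- data.frame(A = c(1, 2, 3, 4, 6), C = c(3, 4, 8, 12, 2))
Data3 <- data.frame(A = c(1, 2, 3, 4, 6), D = c(4, 3, 1, 9, 2))
dfs <- list(Data1, Data2, Data3)  # illustrative name, not from the question

# Fold the list pairwise; all = TRUE keeps keys missing from either side
merged <- Reduce(function(x, y) merge(x, y, by = "A", all = TRUE), dfs)
print(merged)
```

Note that merge sorts the result by the key column by default, while full_join keeps the rows in order of first appearance, so the row order can differ between the two approaches on other data.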

If there are many lists, place them all in a list, loop over that list, and apply reduce to each one:

map(list(list1, list2, list3, ..., listn), ~ reduce(.x, full_join, by = 'A'))

Placing the lists in a list can be automated with mget:

map(mget(ls(pattern = '^list\\d+$')), ~ reduce(.x, full_join, by = 'A'))

Here, we assume the lists are named list1, list2, etc.
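To make the mget step concrete, here is a small end-to-end sketch. The contents of list1 and list2 are made up for illustration (the question only defines one list), and the lookup is split into two steps so it is easy to inspect which objects the pattern picked up.

```r
library(dplyr)
library(purrr)

Data1 <- data.frame(A = c(1, 2, 3, 4, 5), B = c(2, 3, 5, 3, 10))
Data2 <- data.frame(A = c(1, 2, 3, 4, 6), C = c(3, 4, 8, 12, 2))
Data3 <- data.frame(A = c(1, 2, 3, 4, 6), D = c(4, 3, 1, 9, 2))

# Hypothetical lists following the assumed list1, list2, ... naming
list1 <- list(Data1, Data2, Data3)
list2 <- list(Data1, Data3)

# Collect every object whose name is "list" followed by digits
list_names <- ls(pattern = "^list\\d+$")
all_lists <- mget(list_names)

# One joined data frame per input list
out <- map(all_lists, ~ reduce(.x, full_join, by = "A"))
names(out)
```

Because mget returns a named list, the elements of out carry the names of the original lists, which is one way to keep track of which result came from where.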

akrun
  • Now if I loop over the list, how do I create unique names for each of the new dataframes I am creating? – Drew May 27 '20 at 22:08
  • @Drew If `out` is the output from the `map`, `map(out, names) %>% unlist %>% unique` gives the unique names across all the datasets. If we need them for each dataset separately, just do `map(out, names)`; there will be only a single dataset per list after the `reduce` step (if that is what you meant) – akrun May 27 '20 at 22:09
  • Hm. I'm not sure how this is creating the names that I want. Let's say I have 4 new dataframes inside **out** <- map(list(list1, list2, list3, list4), ~ reduce(.x, full_join, by = 'A')), and I want to create new names for them, c("A", "B", "C", "D"). What do I do? Again, thank you for your help – Drew May 27 '20 at 22:20
  • try `map(out, set_names, newnamevec)` – akrun May 27 '20 at 22:22
  • or if you meant unique for each dataset, `map(out, names) %>% unlist %>% make.unique %>% split(rep(seq_along(out), sapply(out, ncol))) %>% map2(out, ., set_names)` – akrun May 27 '20 at 22:29
  • I meant that I want to create custom names for each dataframe inside the list that I've made using the function you've provided: map(list(list1, list2, list3, ..., listn), ~ reduce(.x, full_join, by = 'A')). The dataframes in **out** are currently unnamed. The map code you've graciously provided has not worked for labeling the dataframes as I'd like. – Drew May 27 '20 at 22:41
  • If you have custom names in a list of vectors, use `map2(out, listofnamevec, set_names)` – akrun May 27 '20 at 22:44
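If the goal in the comments is to label the data frames themselves rather than their columns, one possible reading is that set_names should be applied to the list, not mapped over it. The sketch below assumes that interpretation; the four small data frames are stand-ins for the real contents of out.

```r
library(purrr)

# Stand-in for the unnamed list produced by map(list(...), ~ reduce(...))
out <- list(
  data.frame(A = 1:2, B = 3:4),
  data.frame(A = 1:2, B = 5:6),
  data.frame(A = 1:2, B = 7:8),
  data.frame(A = 1:2, B = 9:10)
)

# Name the list elements; the columns inside each data frame are untouched
out_named <- set_names(out, c("A", "B", "C", "D"))
names(out_named)

# Elements can now be pulled out by label
out_named$C
```

In base R, `names(out) <- c("A", "B", "C", "D")` has the same effect on the list. By contrast, `map(out, set_names, newnamevec)` renames the columns within each data frame, which may be why it did not produce the labels asked for.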