0

I'm looking for a scalable method to multiple data frames together, on this condition that the first column matches, while also maintaining the column names which are different for each dataframe.

In this example there are only three data frames, but I'm working with a couple hundred, so am trying to write code that is scalable.

To provide context, the data files are being read from a folder, and I have a for loop to iterate through the files.

files <- list.files(path = "Documents/")

df <- list()

for (i in 1:length(files)) {

   df[[i]] <- read.csv(paste0("Documents/",files[i]))

   # code to perform action goes here

}

Below are the sample inputs and the intended output--

This is df[[1]]

date    blue
1/1/11  5
1/1/12  6
2/1/13  2

This id df[[2]]

date    orange
1/1/11  2
1/1/12  5
2/1/13  2

This is df[[3]]

date    red
1/1/11  4
2/1/13  2

This is the intended output:

date    blue    orange  red
1/1/11  5       2       4
1/1/12  6       5       NA
2/1/13  2       2       2
vw88
  • 139
  • 3
  • 14

3 Answers3

1

The answer is supplied by Matthew Plourde in Merging a lot of data.frames

Reduce(function(x, y) merge(x, y, all=TRUE), list(df1, df2, df3))
Seymour
  • 3,104
  • 2
  • 22
  • 46
  • Is it possible to use list() in a way so it can take in many dataframes without having to type each one out? (I've edited my question to provide more context) – vw88 Feb 12 '18 at 11:54
  • of course, the dataframe must have a sequential names though. Like `paste0("jay_df_", 1:1000,".rds")`. Obviiously, if they have all a complete different name you must type one by one.. – Seymour Feb 12 '18 at 12:00
0

You could use dplyr.

df1 <- read.table(header=TRUE, text="date    blue
1/1/11  5
1/1/12  6
2/1/13  2")

df2 <- read.table(header=TRUE, text="date    orange
1/1/11  2
1/1/12  5
2/1/13  2")

df3 <- read.table(header=TRUE, text="date    red
1/1/11  4
2/1/13  2")

library(dplyr)
df4 <- full_join(inner_join(df1, df2), df3)

df4
#     date blue orange red
# 1 1/1/11    5      2   4
# 2 1/1/12    6      5  NA
# 3 2/1/13    2      2   2
jay.sf
  • 60,139
  • 8
  • 53
  • 110
  • Even though this question is a duplicate, he is asking for a scalable code to work with many dataframes – Seymour Feb 12 '18 at 11:37
-1

Another approach is to use Reduce() and base::merge()

# data sample organized to a list
df1 <- data.frame(date = c("1/1/11", "1/1/12", "2/1/13"), blue = c(5, 6, 2))
df2 <- data.frame(date = c("1/1/11", "1/1/12", "2/1/13"), orange = c(2, 5, 2))
df3 <- data.frame(date = c("1/1/11", "2/1/13"), red = c(4, 2))
df_list <- list(df1, df2, df3)

# solution
merged_df <- Reduce(f = merge, x = df_list)
Ekatef
  • 1,061
  • 1
  • 9
  • 12