-1

I have many data.frames in different length, and I want to concatenate them one by one by row.names in R, if one data.frame miss some names, then fill up the value with 0. e.g.

df1

A 1
B 1
C 1
D 1

df2

C 2
D 2
E 2

What I want is

New_df

A 1 0
B 1 0
C 1 2
D 1 2
E 0 2

Have tried several ways but none of them work. Thanks!!

Pierre Lapointe
  • 16,017
  • 2
  • 43
  • 56
  • https://stackoverflow.com/questions/1299871/how-to-join-merge-data-frames-inner-outer-left-right – nak5120 Jun 26 '17 at 21:19
  • Can we assume that your original data frames do **not** contain any `NA`? This would simplify the answer to initialize missing columns with `0`s... – R Yoda Jun 26 '17 at 21:31
  • rbindlist from data.table with use.names = T will do this automatically for you! But then you'd have to use the data.table package instead of base R. – be_green Jun 26 '17 at 22:29

4 Answers4

2

As OP wants to merge many data.frames, using Reduce makes sense. First, you have to put your dfs in a list and use Reduce like so:

df1 <- read.table(text="ID Val1
A 1
B 1
C 1
D 1",header=TRUE,stringsAsFactors=FALSE)

df2 <- read.table(text="ID Val2
C 2
D 2
E 2",header=TRUE,stringsAsFactors=FALSE)

df3 <- read.table(text="ID Val3
C 2
D 2
F 4",header=TRUE,stringsAsFactors=FALSE)

df_list <- list(df1,df2,df3)

res <- Reduce(function(x, y) merge(x, y, by="ID", all=TRUE), df_list)
res[is.na(res)] <- 0

  ID Val1 Val2 Val3
1  A    1    0    0
2  B    1    0    0
3  C    1    2    2
4  D    1    2    2
5  E    0    2    0
6  F    0    0    4
Pierre Lapointe
  • 16,017
  • 2
  • 43
  • 56
1

For the case when you have more than 2 data frames to merge:

df1 <- read.table(text="A 1
                  B 1
                  C 1
                  D 1", stringsAsFactor=F)

df2 <- read.table(text="C 2
                  D 2
                  E 2", stringsAsFactor=F)

dfs <- list(df1, df2)
df_new <- Reduce(function(...) merge(..., all=T, by="V1"), dfs)
df_new[is.na(df_new)] <- 0
emilliman5
  • 5,816
  • 3
  • 27
  • 37
-1
col1<-c("A", "B", "C","D")
col2<-c(1,1,1,1)
df1<-data.frame(col1, col2)

col1<-c("C", "D", "E")
col2<-c(2,2,2)
df2<-data.frame(col1, col2)


df3<-rbind(df1, df2)
df4<-data.frame(unique(df3$col1))
colnames(df4)[1]<-"ID"
df5<-left_join(df4, df1, by=c("ID"="col1"))
df6<-left_join(df5,df2, by=c("ID"="col1"))
df6[is.na(df6)] <- 0
nak5120
  • 4,089
  • 4
  • 35
  • 94
  • 2
    You may want to add `library(dplyr)` for completeness. But even you do that, the OP wants to merge more than two data frames. I don't think this is what the OP wants. – www Jun 26 '17 at 21:34
-1

Slight modification of rsmith45's answer:

library(tidyverse)
df1 <-
  data_frame(id = c("A", "B", "C", "D"),
             val_1 = c(1, 1, 1, 1))
df2 <-
  data_frame(id = c("C", "D", "E"),
             val_2 = c(2, 2, 2))

new_df <- full_join(df1, df2) %>%
  replace_na(list(val_1 = 0, val_2 = 0))

> new_df
# A tibble: 5 x 3
     id val_1 val_2
  <chr> <dbl> <dbl>
1     A     1     0
2     B     1     0
3     C     1     2
4     D     1     2
5     E     0     2

the replace_na thing is cool, because you can use it to fill NAs with other values depending on the column:

new_df <- full_join(df1, df2) %>%
  replace_na(list(val_1 = -99, val_2 = -1))

> new_df
# A tibble: 5 x 3
     id val_1 val_2
  <chr> <dbl> <dbl>
1     A     1    -1
2     B     1    -1
3     C     1     2
4     D     1     2
5     E   -99     2
Alex Coppock
  • 2,122
  • 3
  • 15
  • 31