0

In R: How can I insert a small data frame into a bigger one and simultaneously add new rows if necessary? The data frames should normally be merged by ID1, ID2 and ID3. But the bigger data frame got all row names of the smaller one.

Main data frame to be inserted in (NAs are left blank):

ID1  ID2  ID3  r1  r2  r3  r4  r5  r6
 A    1    a       d1
 A    1    b       d2
 A    2    a       d3

Small data frame:

ID1  ID2  ID3  r5
 A    1    a   d4
 A    1    b   d5
 A    2    a   d6
 A    2    b   d7

Desired output:

ID1  ID2  ID3  r1  r2  r3  r4  r5  r6
 A    1    a       d1          d4
 A    1    b       d2          d5
 A    2    a       d3          d6
 A    2    b                   d7
user1
  • 404
  • 1
  • 5
  • 18

2 Answers2

0

You need to join the 2 tables up, there are lots of methods and packages to do this but I am always a fan of the tidyverse, in this case dplyr joins.

df_joined <- full_join(df1, df2, by = c('ID1','ID2','ID3'))

There's also a coalesce function in dplyr that you may need if you need if r5 can be in either dataframe

Quixotic22
  • 2,894
  • 1
  • 6
  • 14
0

You can use -

nr <- max(nrow(big_df), nrow(small_df))
big_df[1:nr, colnames(small_df)] <- small_df
big_df

#  ID1 ID2 ID3 r1   r2 r3 r4 r5 r6
#1   A   1   a NA   d1 NA NA d4 NA
#2   A   1   b NA   d2 NA NA d5 NA
#3   A   2   a NA   d3 NA NA d6 NA
#4   A   2   b NA <NA> NA NA d7 NA

data

It is easier to help if you share data in a reproducible format.

big_df <- structure(list(ID1 = c("A", "A", "A"), ID2 = c(1L, 1L, 2L),
 ID3 = c("a", "b", "a"), r1 = c(NA, NA, NA), r2 = c("d1", "d2", "d3"), r3 = c(NA,
NA, NA), r4 = c(NA, NA, NA), r5 = c(NA, NA, NA), r6 = c(NA, NA, 
NA)), class = "data.frame", row.names = c(NA, -3L))

small_df <- structure(list(ID1 = c("A", "A", "A", "A"), ID2 = c(1L, 1L, 2L, 
2L), ID3 = c("a", "b", "a", "b"), r5 = c("d4", "d5", "d6", "d7"
)), class = "data.frame", row.names = c(NA, -4L))
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213