Combine two data frames and remove duplicate columns

Question

I want to cbind two data frames and remove duplicated columns. For example:

df1 <- data.frame(var1=c('a','b','c'), var2=c(1,2,3))
df2 <- data.frame(var1=c('a','b','c'), var3=c(2,4,6))

cbind(df1,df2) #this creates a data frame in which column var1 is duplicated

I want to create a data frame with columns var1, var2 and var3, in which column var1 is not repeated.

score 12 · Accepted Answer · answered Sep 16 '11 at 07:16

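merge will do this: by default it joins the two data frames on the columns they have in common (here var1), so that column appears only once in the result.

Try: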

merge(df1, df2)

kohske


3

This works for the example in the question, but if the values in var1 differ between the two data frames, merge drops the rows that do not match; e.g. try `df2<-data.frame(var1=c('a','b','d'),var3=c(2,4,6))`. This matters when column names are duplicated but the data in them is not. – Maxim.K Oct 17 '14 at 13:51
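
To illustrate the comment above, here is a minimal sketch using the modified df2 it suggests. By default merge keeps only rows whose key appears in both data frames; passing all = TRUE keeps the unmatched rows and fills the gaps with NA. The names inner and outer are only illustrative and do not come from the original posts.

```r
df1 <- data.frame(var1 = c('a', 'b', 'c'), var2 = c(1, 2, 3))
df2 <- data.frame(var1 = c('a', 'b', 'd'), var3 = c(2, 4, 6))

# Default merge: joins on the shared column var1 and keeps only
# the rows whose var1 value occurs in both data frames ('a' and 'b')
inner <- merge(df1, df2, by = "var1")

# all = TRUE keeps every row from both data frames;
# values missing on either side become NA
outer <- merge(df1, df2, by = "var1", all = TRUE)

print(inner)
print(outer)
```

Whether dropping or keeping the unmatched rows is the right behaviour depends on the data, so it may be worth comparing nrow() before and after the merge.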

score 1 · Answer 2 · answered Dec 06 '13 at 13:56

If you inherit someone else's dataset and end up with duplicate column names, one way to find them is to loop over the unique names and check how often each one occurs:

for (name in unique(names(testframe))) {
  if (length(which(names(testframe)==name)) > 1) {
    ## Deal with duplicates here. In this example
    ## just print name and column #s of duplicates:
    print(name)
    print(which(names(testframe)==name))
  }
}
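
If the goal is simply to drop the repeated columns, one possible way to fill in the "deal with duplicates" step is to keep only the first column for each name. This is a sketch rather than part of the original answer: testframe is built here from the question's data as an assumed example, deduped is an illustrative name, and the approach assumes that columns sharing a name also hold the same data.

```r
df1 <- data.frame(var1 = c('a', 'b', 'c'), var2 = c(1, 2, 3))
df2 <- data.frame(var1 = c('a', 'b', 'c'), var3 = c(2, 4, 6))

# cbind keeps both var1 columns, so the names are var1, var2, var1, var3
testframe <- cbind(df1, df2)

# duplicated() flags the second and later occurrences of each name;
# negating it keeps only the first column for every name
deduped <- testframe[!duplicated(names(testframe))]

print(names(deduped))
```

Unlike merge, this does no matching on values: rows are combined purely by position, so both data frames need to be in the same row order.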

score 1 · Answer 3 · answered Sep 21 '22 at 14:34

The function mutate in dplyr accepts a data frame as an unnamed argument: columns of the second data frame that already exist in the first overwrite them, and columns that don't exist in the first are added. Note that rows are combined by position, not matched on var1, so both data frames must be in the same row order.

> mutate(df1,df2)
   var1 var2 var3
 1    a    1    2
 2    b    2    4
 3    c    3    6
