My goal is to merge two large data frames on the column genus, with two conditions: rows must not be duplicated (not solved in my first try), and the information from both data frames must be preserved (not solved in my second try). Please see the desired output at the end:
chromdata <- read.table(text="
genus sp
1 Acosta Acosta_1
2 Aguilera Aguilera_1
3 Acosta Acosta_2
4 Aguilera Aguilera_2
5 other 1 # EDIT: new rows
6 other 2",header=TRUE,fill=TRUE,stringsAsFactors=FALSE)
treedata <- read.table(text="
genus sp
1 Acosta Acosta_3
2 Aguilera Aguilera_3
3 Acosta Acosta_4
4 Aguilera Aguilera_4
5 other 3",header=TRUE,fill=TRUE,stringsAsFactors=FALSE)
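# First try: rows are duplicated, because every chromdata row is paired with every treedata row of the same genus
merge(chromdata, treedata, by = "genus", all = FALSE)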
# Second try: match() returns only the first matching row, so later species of a genus are lost
chromdata$sp2 <- treedata$sp[match(chromdata$genus, treedata$genus)]
chromdata
genus sp sp2
1 Acosta Acosta_1 Acosta_3
2 Aguilera Aguilera_1 Aguilera_3
3 Acosta Acosta_2 Acosta_3 #Acosta_4 missing
4 Aguilera Aguilera_2 Aguilera_3 # Aguilera_4 missing
5 other 1 3
6 other 2 3
Desired Output:
genus sp sp2
1 Acosta Acosta_1 Acosta_3
2 Aguilera Aguilera_1 Aguilera_3
3 Acosta Acosta_2 Acosta_4
4 Aguilera Aguilera_2 Aguilera_4
5 other 1 3 # EDIT: new rows
6 other 2 3
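
One way to read the desired output is that the k-th row of a genus in chromdata should receive the k-th species of that genus in treedata, reusing treedata's species when it has fewer rows for that genus (as with "other"). The following is only a minimal sketch under that assumption, not a definitive solution. The recycling rule is inferred from the "other" rows, the helper `pick_sp` and the names `occ` and `tree_sp` are hypothetical, and the data frames are rebuilt with `data.frame()` so the block runs on its own.

```r
# Same data as above, built directly so the example is self-contained
chromdata <- data.frame(
  genus = c("Acosta", "Aguilera", "Acosta", "Aguilera", "other", "other"),
  sp    = c("Acosta_1", "Aguilera_1", "Acosta_2", "Aguilera_2", "1", "2"),
  stringsAsFactors = FALSE
)
treedata <- data.frame(
  genus = c("Acosta", "Aguilera", "Acosta", "Aguilera", "other"),
  sp    = c("Acosta_3", "Aguilera_3", "Acosta_4", "Aguilera_4", "3"),
  stringsAsFactors = FALSE
)

# Occurrence counter within each genus: 1 for the first Acosta row, 2 for the second, ...
occ <- ave(seq_along(chromdata$genus), chromdata$genus, FUN = seq_along)

# Species of treedata grouped by genus, kept in their original row order
tree_sp <- split(treedata$sp, treedata$genus)

# Hypothetical helper: k-th treedata species for genus g.
# Assumption: when treedata has fewer rows for g than chromdata, its species are recycled.
pick_sp <- function(g, k) {
  candidates <- tree_sp[[g]]
  if (is.null(candidates)) return(NA_character_)  # genus absent from treedata
  candidates[(k - 1) %% length(candidates) + 1]
}

chromdata$sp2 <- unname(mapply(pick_sp, chromdata$genus, occ))
chromdata
```

This keeps one output row per chromdata row, so nothing is duplicated. If recycling is not wanted, replacing the modulo index with plain `candidates[k]` would give NA wherever treedata runs out of species for a genus.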