Merging two data frames, while keeping uncommon rows between them using R

Question

I have two data frames, shown below, that I want to merge on the Family column. Each gene (row) in the first data frame should get the count for its family from the second data frame, and genes whose family does not appear in the second data frame should be kept in the final data frame rather than removed.

#first df
Family <- c("LET-7","LET-7","LET-7","MIR-10","MIR-103","MIR-124","MIR-124","MIR-124")
Sequence <- c("ATCGGCA","ATGCTAC","ATCGGCA","ATCGTTT","TGAGGAG","TGATCAG","AATTCAG","AATTCAG")
my_data_frame <- data.frame(Family,Sequence)

#second df
counts <- c("2","3")
Family <- c("LET-7","MIR-124")
countdf <- data.frame(Family,counts)

The output that I want to have:

Family <- c("LET-7","LET-7","LET-7","MIR-10","MIR-103","MIR-124","MIR-124","MIR-124")
Counts <- c("2","2","2","0","0","3","3","3")
Sequence <- c("ATCGGCA","ATGCTAC","ATCGGCA","ATCGTTT","TGAGGAG","TGATCAG","AATTCAG","AATTCAG")
newdf <- data.frame(Family,Counts,Sequence)

You might want to initialize your data.frames with `stringsAsFactors = FALSE` , that prevents some inconveniences later on — SebSta, Feb 07 '20 at 08:54

dario · Accepted Answer · 2020-02-07T20:14:57.367


Solution using the dplyr package:

library(dplyr)
# Naming the key explicitly avoids the "Joining, by = ..." message
newdf_dplyr <- my_data_frame %>%
  left_join(countdf, by = "Family")

Solution using base R:

newdf_base <- merge(my_data_frame, countdf, by="Family", all.x=TRUE)
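Note that both joins leave `NA` in the counts column for families that are missing from `countdf` (here MIR-10 and MIR-103), whereas the desired output shows "0" there. A minimal base R sketch of one way to finish the job is below. It assumes the counts should stay character strings as in the question, and the names `newdf`, `Counts` and the column order are taken from the question's desired output, not from the answer above.

```r
# Data from the question (stringsAsFactors = FALSE keeps the columns as character)
my_data_frame <- data.frame(
  Family   = c("LET-7", "LET-7", "LET-7", "MIR-10", "MIR-103",
               "MIR-124", "MIR-124", "MIR-124"),
  Sequence = c("ATCGGCA", "ATGCTAC", "ATCGGCA", "ATCGTTT", "TGAGGAG",
               "TGATCAG", "AATTCAG", "AATTCAG"),
  stringsAsFactors = FALSE
)
countdf <- data.frame(
  Family = c("LET-7", "MIR-124"),
  counts = c("2", "3"),
  stringsAsFactors = FALSE
)

# Left join: all.x = TRUE keeps every row of my_data_frame
newdf <- merge(my_data_frame, countdf, by = "Family", all.x = TRUE)

# Families without a match get NA; replace those with "0"
newdf$counts[is.na(newdf$counts)] <- "0"

# Rename and reorder the columns to match the desired output
names(newdf)[names(newdf) == "counts"] <- "Counts"
newdf <- newdf[, c("Family", "Counts", "Sequence")]

print(newdf)
```

`merge()` sorts the result by the key column by default, so the row order within a family may differ from the input. With dplyr, the same `NA` replacement could be done after `left_join()` with something like `mutate(counts = coalesce(counts, "0"))`.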

edited Feb 07 '20 at 20:14

answered Feb 07 '20 at 08:50
