I am sorry if the title isn't the best. I am not sure how to put this in the correct terms.
I am doing some filtering using dplyr. To give a little background, df1 is a list of all of the human genes, and df2 has a list of genes involved in some pathway. The software that gives me the list for df2 doesn't always use the correct gene name that is in df1, so those genes get skipped when I use this filter:
filtered <- df1 %>%
  filter(gene.name %in% df2$V1)
So I am missing some of the data that I am interested in. Is there a way to compare the new data frame, called filtered, to df2 with some code that marks the differences? The majority of the filtered data frame will be the same as df2, but df2 will also contain the gene names that were incorrect. The reason I want to do this is so that I can go back and correct those gene names. df1 and df2 are much larger than the examples, so the mismatches aren't easy to catch by eye.
Here is an example of what I mean, so maybe it will make more sense:
df1
gene.name
ADCY1
ADCY2
ADCY3
ADCY4
df2
gene.name
AC1
ADCY2
AC3
ADCY4
filtered
gene.name
ADCY2
ADCY4
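
To make the goal concrete, here is a minimal sketch of the kind of comparison I am after, run on the example data only. It assumes both data frames use the column name gene.name as in the example (in my real data the df2 column is V1), and the name unmatched is just something I made up for illustration. As far as I understand, dplyr's anti_join() keeps the rows of its first argument that have no match in the second, which may be one way to do this:

```r
library(dplyr)

# Example data from above (the real data frames are much larger)
df1 <- data.frame(gene.name = c("ADCY1", "ADCY2", "ADCY3", "ADCY4"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(gene.name = c("AC1", "ADCY2", "AC3", "ADCY4"),
                  stringsAsFactors = FALSE)

# Same filter as in the question, but using the example's column name
# (gene.name) instead of V1
filtered <- df1 %>%
  filter(gene.name %in% df2$gene.name)

# Rows of df2 whose gene name has no match in filtered,
# i.e. the names that still need to be corrected
unmatched <- anti_join(df2, filtered, by = "gene.name")

print(unmatched)
```

On the example this should leave only AC1 and AC3, which are the names I would then go back and fix. If only the names are needed rather than a data frame, base R's setdiff(df2$gene.name, df1$gene.name) might be a simpler alternative.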