I have a data frame df1
that contains the rows of another data frame df2
(I only have one row in this question for a simpler visualisation of the problem). The data in df1
is organised in pairs according to Cat
. What I need to do is use the rows from df2
to make a data frame with both rows of each pair. That is, the end goal is to have a data frame with only the pairs from df1
as shown in dfgoal
.
df1 <- data.frame(Source = c("1", "1", "1", "1", "1", "1"),
Order = c("1", "2", "3", "4", "5", "6"),
Name = c("A", "A", "A", "A", "A", "A" ),
Cat = c("A1", "A1", "A2", "A2", "A3", "A3"),
Vol = c("35", "43", "none", "none", "63", "24"))
df2 <- data.frame(Source = c("1"),
Order = c("4"),
Name = c("A" ),
Cat = c("A2"),
Vol = c("none"))
dfgoal <- data.frame(Source = c("1", "1"),
Order = c("3", "4"),
Name = c("A", "A"),
Cat = c("A2", "A2"),
Vol = c("none", "none"))
My original data frames are much larger with more variables so I was hoping that there was a way to use the rows from df2
for filtering purposes in df1
so I don't need to specify column values that I can see from df2
. I tried using the group_by function from the dplyr package in the hopes that the paired row (Order == "3"
in this example) would stay grouped when joining the identical rows from df1
and df2
. Needless to say it didn't work. I ended up with only the row from df2
.
df_attempt <- df1 %>% group_by(Source, Name, Cat) #dplyr package
df_attempt <- inner_join(df_attempt, df2) #dplyr package