I have two dataframes of different lengths, and I am trying to use the data of one df to complete the data from the second dataframe. My df1 is a list of projects and organizations participating in these projects, with various information, and my df2 is a list of all organizations involved in these projects and some additional informations that I want to add to df1 to have a more complete picture.
For instance, a shorter example of my inputs:
df1
project_id org_funds project_year org_ID
1 100 2016 11
1 500 2016 13
2 200 2020 11
2 150 2020 12
2 350 2020 13
df2
org_ID org_loc
11 USA
12 NL
13 CN
I am trying to add the info from the column org_loc to df1, so that for every org_ID that matches in the two df, I can add the corresponding org_loc information.
My expected output would be:
df1
project_id org_funds project_year org_ID org_loc
1 100 2016 11 USA
1 500 2016 13 CN
2 200 2020 11 USA
2 150 2020 12 NL
2 350 2020 13 CN
I cannot do it manually, as my df1 have thousands of rows and dozens of columns.
I tried several things, like a for loop, a combination of for loop and if function, and of mutate function and if function. Nothing worked so far because as I understood it, these all require that the two df have the same length which is not the case here.
For instance:
df1$org_loc <- 1
if ((df1$org_ID[]) == (df2$org_ID[])){
df1$org_loc[] <- df2$org_loc[]
}
which returned 2 errors : "the condition has length > 1" and "longer object length is not a multiple of shorter object length"
or another thing I tried:
df1 <- df1 %>%
mutate(org_loc = case_when(df1$org_ID == df2$org_ID ~ df2$org_loc ))
which returns "error in mutate() Caused by error in case_when()
:
! Can't recycle ..1 (left)
(size 7628) to match ..1 (right)
(size 3766)."
(7628 is the size of my df1, 3766 the size of my df2)
I've been searching for days and I didn't find any solution that allows me to do this with two objects of different lengths.
Could any of you have any solution to my problem? Thanks!