I have a data.frame ids
with an index column of person ID number.
ID
PA1
PA2
PA3
PA4
PA5 etc
I want to extract the age and sex of each person from 2 other data.frames, df1
and df2
which have the details (with different headers), each contains only part of the total sample population, though there maybe some overlap, but also ids
contains more rows than both df1
and df2
combines:
df1
ID Sex Age Mode
PA1 M 34 HD
PA3 F 45 PD
and df2
ID Gender Age
PA4 M 67
PA3 F 45
PA2 M 65
In total there are about 1700 rows of data. I want to get the Sex
or Gender
and Age
from df1
and df2
and put them with the correct ID
in new columns in ids
data.frame, so I get this:
ID Age Sex
PA1 34 M
PA2 65 M
PA3 etc...
PA4
PA5
df1
and df2
both contain other columns of data which I don't need in ids
so I don't think I want to use left_join or merge.
I was considering doing something like this with dlplyr
:
ids = ids%>%
mutate("Sex" = case_when(df1$Gender != "" ~df1$Gender, TRUE ~"" ))
and if blank to look in df2
but realise that won't match the values with the correct IDs. The IDs are not currently in the same order in the different data.frames