I have two dataframes, as below.
df_1:
id id_type
100 atype
101 atype
102 atype
603 another
604 another
605 another
and df_2:
id_1 id_2 id_3
100 600 200
101 601 200
102 602 200
103 603 300
104 604 400
105 605 400
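For anyone who wants to reproduce this, the two frames can be built roughly like this. It is a minimal sketch: the values are just the sample data from the tables above, not my real IDs, and integer dtypes are an assumption.

```python
import pandas as pd

# Sample data copied from the df_1 table above.
df_1 = pd.DataFrame({
    "id": [100, 101, 102, 603, 604, 605],
    "id_type": ["atype", "atype", "atype", "another", "another", "another"],
})

# Sample data copied from the df_2 table above.
df_2 = pd.DataFrame({
    "id_1": [100, 101, 102, 103, 104, 105],
    "id_2": [600, 601, 602, 603, 604, 605],
    "id_3": [200, 200, 200, 300, 400, 400],
})
```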
I want to unify these to use the second type of ID, the one starting with 600. OBVIOUSLY, the real ID differences are not just "add 500 to id_1 to get id_2"; this is just for easy demonstration.
So I want to create a new unified_id column, where I only use the 600-type IDs. So the output would be:
df_1:
id id_type unified_id
100 atype 600
101 atype 601
102 atype 602
603 another 603
604 another 604
605 another 605
So I have a conditional new column creation, where the condition is "if the type is 'another', keep the ID from the current DF, but if not, read the ID from df_2". This is different from stuff like this for example, as the data doesn't need to be filled by a condition and a number of fixed values, but by a condition and values that need to be read from either the current or another df.
How can I achieve this?
[EDIT]
No, this is NOT covered in merging 101. This is a calculated column question, not a merging one. Merging doesn't combine half a column from one df and the other half from another. I don't know how this is not clear. If I merge on left - id and right - id_1, I get a new column that's half correct and half NaN. That's not what I want. I need to conditionally read some values from the other df and the rest from the current one.
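To make the merge problem concrete, here is a small sketch of the attempt described above. It assumes the sample data from the question; the name `merged` is only for illustration.

```python
import pandas as pd

# Sample data from the question.
df_1 = pd.DataFrame({
    "id": [100, 101, 102, 603, 604, 605],
    "id_type": ["atype", "atype", "atype", "another", "another", "another"],
})
df_2 = pd.DataFrame({
    "id_1": [100, 101, 102, 103, 104, 105],
    "id_2": [600, 601, 602, 603, 604, 605],
    "id_3": [200, 200, 200, 300, 400, 400],
})

# Left merge on df_1.id == df_2.id_1, bringing in only id_2.
merged = df_1.merge(
    df_2[["id_1", "id_2"]],
    how="left",
    left_on="id",
    right_on="id_1",
)

# id_2 is filled for the 'atype' rows but NaN for the 'another' rows,
# because 603, 604 and 605 never appear in df_2.id_1.
print(merged[["id", "id_type", "id_2"]])
```

The id_2 column here is only the "half correct" part: the 'another' rows still need their own id carried over.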