Suppose I have two data frames as following:
df1 = pd.DataFrame([('k01', 'p01'), ('k02', 'p02'), ('k03', 'p03')], columns=['kid_id','parent_id'])
df2 = pd.DataFrame([('k01', 'andy'), ('p03', 'kyle'), ('k03', 'danny'), ('p01', 'bob'), ('p02', 'karen'), ('k02', 'teddy'), ('p04', 'sussy'), ('k05', 'emily')], columns=['id','name'])
I want to add two new columns, kid_name
and parent_name
to df1
, something like this, by looking up the id
in df2
and return name
:
df1[['kid_name', 'parent_name']] = df2.....
How to do this in one clean line?
Update: following @Anurag Dabas's suggestion below, here is what I come up with:
df3 = df1.merge(df2, left_on='kid_id', right_on='id', how='left')\
.rename(columns={'name':'kid_name'}).drop('id', axis=1)\
.merge(df2, left_on='parent_id', right_on='id', how='left')\
.rename(columns={'name':'parent_name'}).drop('id', axis=1)
Still, that seems awfully messy (what if I need to do this for more than 2 columns). If anyone can improve this, thanks?
Expected output:
kid_id parent_id kid_name parent_name
0 k01 p01 andy bob
1 k02 p02 teddy karen
2 k03 p03 danny kyle