I have two data frames, like this:
import numpy as np
import pandas as pd

df1 = pd.DataFrame()
df1['v1'] = [5,7,2,4,9,7,2]
df1['v2'] = ["a1", 'nan', "a2", "a3", "a5", "a6", "a9"]
   v1   v2
0   5   a1
1   7  nan
2   2   a2
3   4   a3
4   9   a5
5   7   a6
6   2   a9
and
dfa = pd.DataFrame()
dfa['pc1'] = np.random.rand(5)
dfa['pc2'] = np.random.rand(5)
dfa['idx'] = ["a1", "a2", "a3", "a6", "a9"]
df2 = dfa.set_index('idx')
          pc1       pc2
idx
a1   0.048725  0.050773
a2   0.289110  0.302272
a3   0.720966  0.663910
a6   0.021616  0.308114
a9   0.205923  0.583591
df1 has a column v2 whose string values match the index of df2. But it also contains nan, and it may contain values that have no corresponding row name in df2.
I now want to merge these data frames into one, like this:
   v1   v2       pc1       pc2
0   5   a1  0.048725  0.050773
1   7  nan       nan       nan
2   2   a2  0.289110  0.302272
3   4   a3  0.720966  0.663910
4   9   a5       nan       nan
5   7   a6  0.021616  0.308114
6   2   a9  0.205923  0.583591
In R this is very easy using the rownames_to_column(df2, "v2") and left_join(df1, .) functions.
But how can I do it in pandas?
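For reference, here is a minimal sketch of what I think the pandas equivalent might look like, assuming `DataFrame.join` with `on="v2"` matches the values of that column against the index of df2 and defaults to a left join. The variable names `merged` and `alt` and the seeded random generator are my own additions so the example is reproducible; the actual numbers will differ from the ones printed above.

```python
import numpy as np
import pandas as pd

# Rebuild the two frames from the question (seeded so the run is repeatable).
df1 = pd.DataFrame({
    "v1": [5, 7, 2, 4, 9, 7, 2],
    "v2": ["a1", "nan", "a2", "a3", "a5", "a6", "a9"],
})

rng = np.random.default_rng(0)
df2 = pd.DataFrame(
    {"pc1": rng.random(5), "pc2": rng.random(5)},
    index=pd.Index(["a1", "a2", "a3", "a6", "a9"], name="idx"),
)

# Left join: keep every row of df1 and look up df1["v2"] in df2's index.
# Rows whose v2 has no match in df2 ("nan", "a5") get NaN in pc1 and pc2.
merged = df1.join(df2, on="v2")

# Closer to the R version: turn the index into a "v2" column first
# (like rownames_to_column), then left-merge on that column (like left_join).
alt = df1.merge(
    df2.reset_index().rename(columns={"idx": "v2"}),
    on="v2",
    how="left",
)

print(merged)
```

Both variants should keep all seven rows of df1 in their original order; which one reads better probably depends on whether df2 needs to keep its index afterwards.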