I have two dataframe namely df1
and df2
. Both has around 30M rows so its pretty large.
The df1
has 3 columns:
A B C
where df2
has two columns A D
.
The column A
is mostly same but is unordered and has some values. For example df2['A']
missing some values compare to df1['A']
.
what I need
I need to merge two dataframe so that df1
has column D
but for those missing values, I need it to be NaN
.
Example
df1
A B C
x 1 2
y 5 6
z 4 4
df2
A D
x 10
z 20
Merged Dataframe
A B C D
x 1 2 10
y 5 6 NaN
z 4 4 20
What I have tried
I tried using df1.merge(df2, how='right', on="A")
but couldn't figure out how to make those missing values NaN
. I also tried to iterate over the rows of df2
and add the column D
values to df1
but it will take days to finish since the both dataframes are big.