I have 2 pandas DataFrames (df1 and df2) like this:
import pandas as pd
data1 = [['tom', '10'], ['nick', '15'], ['juli', '14']]
df1 = pd.DataFrame(data1, columns=['name', 'id'])
data2 = [['tom', '59'], ['jane', '20'], ['leo', '17']]
df2 = pd.DataFrame(data2, columns=['name', 'id'])
# df1
#    name  id
# 0   tom  10
# 1  nick  15
# 2  juli  14
# df2
#    name  id
# 0   tom  59
# 1  jane  20
# 2   leo  17
How can I merge them into the following DataFrame?
data_merged = [['tom', '10', 'tom', '59'], ['nick', '15', '', ''], ['juli', '14', '', ''],
               ['', '', 'jane', '20'], ['', '', 'leo', '17']]
df_merged = pd.DataFrame(data_merged, columns=['name_1', 'id_1', 'name_2', 'id_2'])
# df_merged
#   name_1 id_1 name_2 id_2
# 0    tom   10    tom   59
# 1   nick   15
# 2   juli   14
# 3               jane   20
# 4                leo   17
The merge rule is as follows:
If a value in the name column of df1 also appears in the name column of df2, the two rows should appear in the same row of df_merged.
Otherwise, the rows should be placed in separate rows of df_merged, with the other DataFrame's columns left empty.
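For reference, the rule above resembles a full outer join on name. The following is only a minimal sketch of one way it might be expressed, assuming the names are unique within each DataFrame (duplicates would produce one row per matching pair). The variables left, right, matched, right_only and df_sketch are illustrative names, not part of the original code.

```python
import pandas as pd

df1 = pd.DataFrame([['tom', '10'], ['nick', '15'], ['juli', '14']], columns=['name', 'id'])
df2 = pd.DataFrame([['tom', '59'], ['jane', '20'], ['leo', '17']], columns=['name', 'id'])

# Suffix the column labels so that both name columns survive the join.
left = df1.add_suffix('_1')    # columns: name_1, id_1
right = df2.add_suffix('_2')   # columns: name_2, id_2

# A left join keeps every df1 row in its original order and pairs it
# with the df2 row of the same name when one exists (NaN otherwise).
matched = left.merge(right, how='left', left_on='name_1', right_on='name_2')

# Rows of df2 whose name never occurs in df1.
right_only = right[~right['name_2'].isin(left['name_1'])]

# Stack the two parts and show missing cells as empty strings.
df_sketch = pd.concat([matched, right_only], ignore_index=True).fillna('')
print(df_sketch)
```

A single merge with how='outer' should give the same set of rows, but pandas documents that an outer join sorts keys lexicographically, so the row order may differ from the one shown in df_merged. The left join plus concat keeps the df1 rows first and appends the df2-only rows afterwards.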