The objective of the code below is to implement a FULL OUTER JOIN using three pandas data frames. All records of all data frames should be printed, and if there's a relationship between two or three records, then they should be printed in the same line.
The fields to use to relate the data frames are type_1
and id_1
in the first data frame, type_2
and id_2
in the second data frame and type_3
and id_3
in the third data frame.
The problem is that the relationship between the second and third data frames is not working. Take a look at the case in line 11 and 13, it should be a single line because type_2
= type_3
and id_2
= id_3
. The expected output is in line 11 11 NaN NaN NaN 7.0 8 KoKo 7.0 8 Kuku
and line 13 shouldn't be printed. How to fix this?
import pandas as pd
raw_data = {
'type_1': [0, 1, 1, 2, 2],
'id_1': ['3', '4', '5', '3', '3'],
'name_1': ['Alex', 'Amy', 'Allen', 'Peter', 'Liz']}
df_a = pd.DataFrame(raw_data, columns = ['type_1', 'id_1', 'name_1' ])
raw_datab = {
'type_2': [1, 1, 1, 0,7],
'id_2': ['4', '5', '5', '7', '8'],
'name_2': ['Billy', 'Brian', 'Joe', 'Bryce', 'KoKo']}
df_b = pd.DataFrame(raw_datab, columns = ['type_2', 'id_2', 'name_2'])
raw_datac = {
'type_3': [1, 1, 1, 1, 2, 2, 7],
'id_3': ['4', '6', '5', '5', '3', '3','8'],
'name_3': ['School', 'White', 'Jane', 'Homer', 'Paul', 'Lorel', 'Kuku']}
df_c = pd.DataFrame(raw_datac, columns = ['type_3', 'id_3', 'name_3'])
merged = df_a
merged = merged.merge(df_b, how='outer', left_on=['type_1', 'id_1'],
right_on=['type_2', 'id_2'])
merged = merged.merge(df_c, how='outer', left_on=['type_1', 'id_1'],
right_on=['type_3', 'id_3'])
print(merged)
Result:
type_1 id_1 name_1 type_2 id_2 name_2 type_3 id_3 name_3
0 0.0 3 Alex NaN NaN NaN NaN NaN NaN
1 1.0 4 Amy 1.0 4 Billy 1.0 4 School
2 1.0 5 Allen 1.0 5 Brian 1.0 5 Jane
3 1.0 5 Allen 1.0 5 Brian 1.0 5 Homer
4 1.0 5 Allen 1.0 5 Joe 1.0 5 Jane
5 1.0 5 Allen 1.0 5 Joe 1.0 5 Homer
6 2.0 3 Peter NaN NaN NaN 2.0 3 Paul
7 2.0 3 Peter NaN NaN NaN 2.0 3 Lorel
8 2.0 3 Liz NaN NaN NaN 2.0 3 Paul
9 2.0 3 Liz NaN NaN NaN 2.0 3 Lorel
10 NaN NaN NaN 0.0 7 Bryce NaN NaN NaN
11 NaN NaN NaN 7.0 8 KoKo NaN NaN NaN
12 NaN NaN NaN NaN NaN NaN 1.0 6 White
13 NaN NaN NaN NaN NaN NaN 7.0 8 Kuku