I have two dataframes of Premier League soccer players:
df1:
ID Player Team Pos
1 Gabriel Dos Santos Arsenal DF
218 Conor Gallagher Crystal Palace MF
396 Gabriel Jesus Manchester City FW
df2:
ID name team minutes
15 Gabriel dos Santos Magalhães Arsenal 3063
18 Gabriel Martinelli Silva Arsenal 1855
27 Gabriel Fernando de Jesus Arsenal 1871
I want to merge the dataframes by name/player and keep ALL rows and columns of d1 and d2 even if a name is not in both dataframes. It would look something like this:
ID name team minutes ID Pos Team
15 Gabriel dos Santos Magalhães Arsenal 3063 1 DF Arsenal
18 Gabriel Martinelli Silva Arsenal 1855 NA NA NA
27 Gabriel Fernando de Jesus Arsenal 1871 396 FW Manchester City
NA Conor Gallagher NA NA 218 MF Crystal Palace
The only problem is that the names in d1 do not exactly match the names in d2 (think of d1 as a partial name, or substring of d2 names), and some of the names in d1 are not in d2 (and viceversa).
I did this:
d2[d2['name'].apply(lambda player: d1['Player'].str.contains(player)).any(1)]
But it's not working. What should I do?