0

I have this dataframe, df_players, where player id occurs many times throught many rounds and is thus repeated:

       player_id    player_name   position          team      round  player_points
0          74159  Abel Hernández  forward  Internacional         30          -0.1
1          74159  Abel Hernández  forward  Internacional         17           4.0
2          74159  Abel Hernández  forward  Internacional         18          11.0
3          74159  Abel Hernández  forward  Internacional         19          -0.4
4          74159  Abel Hernández  forward  Internacional         20           5.7
...

Also I have this df_teams:

            team    team_points     round
0    Internacional         69.4        30
1    Internacional         10.1        17
2    Internacional         24.3        20
3    Internacional         16.5        18
4    Internacional         33.5        19
...

And I need to bring 'team_points' value for each 'round' from df_teams into df_players, ending up with:

       player_id    player_name   position          team      round  player_points  team_points
0          74159  Abel Hernández  forward  Internacional         30          -0.1     69.4
1          74159  Abel Hernández  forward  Internacional         17           4.0     10.1
2          74159  Abel Hernández  forward  Internacional         18          11.0     16.5
3          74159  Abel Hernández  forward  Internacional         19          -0.4     33.5
4          74159  Abel Hernández  forward  Internacional         20           5.7     24.3

I have tried mapping with:

df_players['team_points'] = df_players['round'].map(df_teams.set_index('round')['team_points'])

But this throws the error:

pandas.core.indexes.base.InvalidIndexError: Reindexing only valid with uniquely valued Index objects

How do I achieve this?

SeaBean
  • 22,547
  • 3
  • 13
  • 25
8-Bit Borges
  • 9,643
  • 29
  • 101
  • 198

1 Answers1

1

Use df.merge() like this:

df_players.merge(df_teams, on=['team', 'round'])

This performs an 'inner' join by default to merge only those appear in both tables on the 2 merge fields ['team', 'round'].

Input: df_players (added a new player at the end):

       player_id    player_name   position          team      round  player_points
0          74159  Abel_Hernández  forward  Internacional         30          -0.1
1          74159  Abel_Hernández  forward  Internacional         17           4.0
2          74159  Abel_Hernández  forward  Internacional         18          11.0
3          74159  Abel_Hernández  forward  Internacional         19          -0.4
4          74159  Abel_Hernández  forward  Internacional         20           5.7
5          74160  John_Hernández  forward  Internacional         18           7.7

Output:

   player_id     player_name position           team  round  player_points  team_points
0      74159  Abel_Hernández  forward  Internacional     30           -0.1         69.4
1      74159  Abel_Hernández  forward  Internacional     17            4.0         10.1
2      74159  Abel_Hernández  forward  Internacional     18           11.0         16.5
3      74160  John_Hernández  forward  Internacional     18            7.7         16.5
4      74159  Abel_Hernández  forward  Internacional     19           -0.4         33.5
5      74159  Abel_Hernández  forward  Internacional     20            5.7         24.3
SeaBean
  • 22,547
  • 3
  • 13
  • 25
  • The output on player_id is a little bit distorted for I copied your data since it has 2 words. Let me fix it – SeaBean Mar 16 '21 at 16:05
  • I added a new player_id 74160 just to test it works for more than one player. It now appears together with round 18. You can further sort by player_id if you like. – SeaBean Mar 16 '21 at 16:12