1

I have a dataframe df, where I need to find if some player_id has changed its team value along the season:

player_id      team
1         Liverpool
2        Manchester
1         Liverpool
2        Manchester
1         Liverpool
2         Tottenham

To find duplicated pairs I would simply:

duplicate = df[df.duplicated(['player_id', 'team'])] 

But the dataframe has duplicated pairs as a rule.

I need to find all cases when one player_id has more than one team, and print:

2

How so?

8-Bit Borges
  • 9,643
  • 29
  • 101
  • 198
  • Does this answer your question? [Pandas count(distinct) equivalent](https://stackoverflow.com/questions/15411158/pandas-countdistinct-equivalent) – BallpointBen Dec 08 '20 at 04:21
  • not really. I don't need all ocurrences, but to pinpoint wich id has two teams – 8-Bit Borges Dec 08 '20 at 04:27
  • 1
    Would this help? `list(set(df[df.groupby('id')['team'].transform('nunique').ne(1)]['id'].tolist()))` – wwnde Dec 08 '20 at 04:36

2 Answers2

1
df[df.groupby('player_id')['team'].transform('nunique').ne(1)]
wwnde
  • 26,119
  • 6
  • 18
  • 32
1

You can groupby and calculate the number of unique values per group with the method nunique():

df.groupby('player_id')['team'].nunique().loc[lambda x: x > 1].index
Mykola Zotko
  • 15,583
  • 3
  • 71
  • 73