Pandas - find which id has more than one corresponding value in another column

Question

I have a dataframe df, where I need to find if some player_id has changed its team value along the season:

player_id      team
1         Liverpool
2        Manchester
1         Liverpool
2        Manchester
1         Liverpool
2         Tottenham

To find duplicated pairs I would simply:

duplicate = df[df.duplicated(['player_id', 'team'])]

But the dataframe has duplicated pairs as a rule.

I need to find all cases when one player_id has more than one team, and print:

How so?

Does this answer your question? [Pandas count(distinct) equivalent](https://stackoverflow.com/questions/15411158/pandas-countdistinct-equivalent) — BallpointBen, Dec 08 '20 at 04:21
not really. I don't need all ocurrences, but to pinpoint wich id has two teams — 8-Bit Borges, Dec 08 '20 at 04:27
Would this help? `list(set(df[df.groupby('id')['team'].transform('nunique').ne(1)]['id'].tolist()))` — wwnde, Dec 08 '20 at 04:36

wwnde · Accepted Answer · 2020-12-08T04:34:26.503

1

df[df.groupby('player_id')['team'].transform('nunique').ne(1)]

edited Dec 08 '20 at 04:34

answered Dec 08 '20 at 04:19

wwnde

score 1 · Answer 2 · answered Dec 08 '20 at 04:53

1

You can groupby and calculate the number of unique values per group with the method nunique():

df.groupby('player_id')['team'].nunique().loc[lambda x: x > 1].index

answered Dec 08 '20 at 04:53

Mykola Zotko

2 Answers2