Input:

Type count
manager 123
manager 123
manager 111
manager 222
tech lead 888
tech lead 888
tech lead 888
tech lead 444
developer 234
developer 567
developer 890

Output: I want the number of distinct count values for each label, i.e. manager, tech lead, developer

Type count
manager 3
tech lead 2
developer 3
Psidom
  • `.groupby('Type').count()` ? – Psidom Mar 12 '23 at 18:11
  • @Psidom. Missing `.drop_duplicates(...)`? – Corralien Mar 12 '23 at 19:00
  • What have you tried, and what do you need help with exactly? Like to start, do you know [how to use groupby](https://pandas.pydata.org/docs/user_guide/groupby.html)? For tips, check out [How to ask a good question](/help/how-to-ask). This might also be useful: [How to make good reproducible pandas examples](/q/20109391/4518341). – wjandrea Mar 12 '23 at 19:14

2 Answers

1

You can use groupby with nunique*:

df.groupby('Type', as_index=False, sort=False)['count'].nunique()
        Type  count
0    manager      3
1  tech lead      2
2  developer      3

* link is currently dead; for now use the docs for 1.4 or dev
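
For anyone who wants to run this end to end, here is a minimal sketch. The question doesn't show how `df` is built, so the DataFrame construction below is an assumption reconstructed from the Input table; `result` is just an illustrative name.

```python
import pandas as pd

# Assumed reconstruction of the question's Input table
df = pd.DataFrame({
    "Type": ["manager"] * 4 + ["tech lead"] * 4 + ["developer"] * 3,
    "count": [123, 123, 111, 222, 888, 888, 888, 444, 234, 567, 890],
})

# Number of distinct 'count' values per Type.
# sort=False keeps the groups in order of first appearance;
# as_index=False keeps 'Type' as a regular column.
result = df.groupby("Type", as_index=False, sort=False)["count"].nunique()
print(result)
```

Note that `nunique` ignores NaN by default (`dropna=True`), which doesn't matter here because the sample data has no missing values.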

wjandrea
0

To get the expected output, you have to drop the duplicate (Type, count) pairs first, then count the rows left in each group:

>>> (df.drop_duplicates(['Type', 'count'])
       .value_counts('Type')
       .rename('count').reset_index())

        Type  count
0  developer      3
1    manager      3
2  tech lead      2

>>> (df.drop_duplicates(['Type', 'count'])
       .groupby('Type', as_index=False)['count']
       .count())  # or .nunique(), or .size()

        Type  count
0  developer      3
1    manager      3
2  tech lead      2
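
Both snippets above return the groups in alphabetical order, which differs from the order in the question's expected output. If that matters, a possible variant (a sketch, assuming `df` looks like the Input table; `deduped` and `result` are illustrative names) is to pass `sort=False` to `groupby`:

```python
import pandas as pd

# Assumed reconstruction of the question's Input table
df = pd.DataFrame({
    "Type": ["manager"] * 4 + ["tech lead"] * 4 + ["developer"] * 3,
    "count": [123, 123, 111, 222, 888, 888, 888, 444, 234, 567, 890],
})

# Keep one row per distinct (Type, count) pair
deduped = df.drop_duplicates(["Type", "count"])

# Count the remaining rows per Type, keeping order of first appearance
result = (deduped.groupby("Type", sort=False)
                 .size()
                 .reset_index(name="count"))
print(result)
```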
Corralien